OM nucleic - nucleic search, using sw model



Run on: May 14, 2004, 00:22:53 ; Search time 10175.3 Seconds 

(without alignments) 
12622.391 Million cell updates/sec 

Title: US-09-931-157-2 
Perfect score: 4301 

Sequence: 1 gagacattccggtgggggac ctgggaaaaaaaaaaaaaaa 4301 

Scoring table: IDENTITY_NUC 

Gapop 10.0 , Gapext 1.0 
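
The header names a Smith-Waterman ("sw") model with a gap-opening penalty of 10.0 and a gap-extension penalty of 1.0. A minimal Python sketch of local alignment scoring with affine gaps follows. It is an illustration only, not the search program's code: the function name is hypothetical, the match (+1) and mismatch (-0.6) values are assumptions because this report does not list the entries of the IDENTITY_NUC table, and a gap of length k is assumed to cost Gapop + k * Gapext.

```python
def smith_waterman_affine(query, target, match=1.0, mismatch=-0.6,
                          gap_open=10.0, gap_ext=1.0):
    """Return the best local alignment score (Gotoh-style recurrences).

    Assumptions (not taken from the report): match/mismatch values, and
    a gap of length k costing gap_open + k * gap_ext.
    """
    neg_inf = float("-inf")
    cols = len(target)
    prev_h = [0.0] * (cols + 1)      # best score ending at the previous row
    prev_f = [neg_inf] * (cols + 1)  # previous row, alignment ends in a vertical gap
    best = 0.0
    for i in range(1, len(query) + 1):
        cur_h = [0.0] * (cols + 1)
        cur_f = [neg_inf] * (cols + 1)
        e = neg_inf                  # current row, alignment ends in a horizontal gap
        for j in range(1, cols + 1):
            # Extend an open gap, or open a new one from the best ungapped state.
            e = max(e - gap_ext, cur_h[j - 1] - gap_open - gap_ext)
            cur_f[j] = max(prev_f[j] - gap_ext, prev_h[j] - gap_open - gap_ext)
            step = match if query[i - 1] == target[j - 1] else mismatch
            # Local alignment: never let the score drop below zero.
            cur_h[j] = max(0.0, prev_h[j - 1] + step, e, cur_f[j])
            best = max(best, cur_h[j])
        prev_h, prev_f = cur_h, cur_f
    return best


if __name__ == "__main__":
    print(smith_waterman_affine("ACGTACGT", "ACGAACGT"))
```

Only two rows of the matrix are kept, so memory grows with the target length rather than with the product of both lengths; recovering the alignment itself would need a traceback matrix, which this sketch omits.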



Searched: 27513289 seqs, 14931090276 residues 

Total number of hits satisfying chosen parameters: 55026578 
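
The run statistics above are mutually consistent if each database sequence is compared in both orientations. That is an assumption here, not something the report states. Under it, the hit count equals twice the number of sequences, and the cell-update rate is approximately query length x database residues x 2 / search time. A short Python check, with hypothetical function and variable names:

```python
def cell_updates_per_second(query_len, db_residues, seconds, strands=2):
    """Dynamic-programming cells filled per second.

    strands=2 is an assumption (forward and complement searched).
    """
    return query_len * db_residues * strands / seconds


query_len = 4301               # "Perfect score" / query length from the header
db_seqs = 27_513_289           # "Searched: ... seqs"
db_residues = 14_931_090_276   # "Searched: ... residues"
search_seconds = 10175.3       # "Search time"

hits = db_seqs * 2             # compare with the reported 55026578
mcups = cell_updates_per_second(query_len, db_residues, search_seconds) / 1e6

if __name__ == "__main__":
    print(hits)
    print(round(mcups, 1))     # compare with the reported 12622.391
```

The small difference from the printed rate is within what rounding of the search time to 0.1 s would produce.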



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database :       EST:*
                 1:  em_estba:*
                 2:  em_esthum:*
                 3:  em_estin:*
                 4:  em_estmu:*
                 5:  em_estov:*
                 6:  em_estpl:*
                 7:  em_estro:*
                 8:  em_htc:*
                 9:  gb_est1:*
                10:  gb_est2:*
                11:  gb_htc:*
                12:  gb_est3:*
                13:  gb_est4:*
                14:  gb_est5:*
                15:  em_estfun:*
                16:  em_estom:*
                17:  em_gss_hum:*
                18:  em_gss_inv:*
                19:  em_gss_pln:*
                20:  em_gss_vrt:*
                21:  em_gss_fun:*
                22:  em_gss_mam:*
                23:  em_gss_mus:*
                24:  em_gss_pro:*
                25:  em_gss_rod:*
                26:  em_gss_phg:*
                27:  em_gss_vrl:*
                28:  gb_gss1:*
                29:  gb_gss2:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES 

                    %
Result          Query
  No.    Score  Match  Length  DB  ID         Description

    1   1372.2   31.9    3878  11  AK083415   AK083415 Mus muscu
    2   1371.2   31.9    3990  11  AK085532   AK085532 Mus muscu
    3     1329   30.9    1329  29  AY415512   AY415512 Homo sapi
    4   1137.6   26.4    2521  11  AK082103   AK082103 Mus muscu
    5   1126.4   26.2    3611  11  AK085165   AK085165 Mus muscu
    6     1020   23.7    2669  11  AK076426   AK076426 Mus muscu
    7      996   23.2    1329  29  AY415514   AY415514 Mus muscu
c   8      987   22.9    1201   9  AL571798   AL571798 AL571798
    9    936.6   21.8    1144  29  AY415513   AY415513 Pan trogl
c  10      866   20.1     957  12  BI520706   BI520706 603071813
c  11    860.2   20.0    1201   9  AL553041   AL553041 AL553041
   12      851   19.8     942   9  AL543805   AL543805 AL543805
   13      848   19.7     891  13  BQ229233   BQ229233 AGENCOURT
   14    816.2   19.0    1201   9  AL546465   AL546465 AL546465
   15    808.6   18.8     972  12  BI858627   BI858627 603389094
c  16    802.4   18.7     942   9  AL570142   AL570142 AL570142
   17    794.8   18.5     884  13  BU557315   BU557315 AGENCOURT
   18      792   18.4     911  13  BQ719386   BQ719386 AGENCOURT
c  19      788   18.3    1201   9  AL571072   AL571072 AL571072
   20    773.6   18.0     852  13  BU172663   BU172663 AGENCOURT
   21      772   17.9    1201   9  AL553065   AL553065 AL553065
c  22    766.6   17.8     775  14  CA771707   CA771707 io81f04.x
   23    741.8   17.2     770  12  BM014035   BM014035 603639686
c  24    740.2   17.2     942  13  BX345882   BX345882 BX345882
   25      738   17.2     999  13  BX417121   BX417121 BX417121
   26      736   17.1    1201   9  AL545283   AL545283 AL545283
   27    734.2   17.1     978  13  BQ683643   BQ683643 AGENCOURT
   28    732.8   17.0    1121  12  BM926545   BM926545 AGENCOURT
   29    729.2   17.0     758  12  BM014042   BM014042 603639695
   30    727.8   16.9     885  12  BG769122   BG769122 602743382
   31      718   16.7     785   9  AU117045   AU117045 AU117045
   32    712.8   16.6     743   9  AU138228   AU138228 AU138228
   33    708.8   16.5     961  12  BM804821   BM804821 AGENCOURT
c  34    706.2   16.4     800   9  AI760041   AI760041 wg57e06.x
   35      705   16.4     716   9  AL699988   AL699988 DKFZp686K
c  36      703   16.3     722  12  BM970305   BM970305 UI-CF-EC1
c  37    698.8   16.2     726   9  AI422064   AI422064 tf57cl2.x
c  38    697.4   16.2     866   9  AI188458   AI188458 qdl4d01.x
   39    696.8   16.2     771   9  AU116904   AU116904 AU116904
   40    691.4   16.1     934  13  BQ718019   BQ718019 AGENCOURT
   41    686.4   16.0     839   9  AU136164   AU136164 AU136164
c  42    679.4   15.8     771   9  AI567763   AI567763 tr62c07.x
c  43      673   15.6     699  12  BM974913   BM974913 UI-CF-EC1
c  44    662.6   15.4     751   9  AA651686   AA651686 nn47b02.r
   45    661.6   15.4     941  13  BX345883   BX345883 BX345883




ALIGNMENTS 



RESULT 1
AK083415
LOCUS       AK083415                3878 bp    mRNA    linear   HTC 20-SEP-2003
DEFINITION  Mus musculus 9 days embryo whole body cDNA, RIKEN full-length
            enriched library, clone:D030003K13 product:ENDOTHELIN B RECEPTOR
            PRECURSOR, full insert sequence.
ACCESSION   AK083415
VERSION     AK083415.1  GI:26350536
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1
  AUTHORS   Carninci,P. and Hayashizaki,Y.
  TITLE     High-efficiency full-length cDNA cloning
  JOURNAL   Meth. Enzymol. 303, 19-44 (1999)
  MEDLINE   99279253
   PUBMED   10349636
REFERENCE   2
  AUTHORS   Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
            Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Normalization and subtraction of cap-trapper-selected cDNAs to
            prepare full-length cDNA libraries for rapid discovery of new genes
  JOURNAL   Genome Res. 10 (10), 1617-1630 (2000)
  MEDLINE   20499374
   PUBMED   11042159
REFERENCE   3
  AUTHORS   Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
            Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
            Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
            Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
            Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
            Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
            Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
  TITLE     RIKEN integrated sequence analysis (RISA) system--384-format
            sequencing pipeline with 384 multicapillary sequencer
  JOURNAL   Genome Res. 10 (11), 1757-1771 (2000)
  MEDLINE   20530913
   PUBMED   11076861
REFERENCE   4
  AUTHORS   The RIKEN Genome Exploration Research Group Phase II Team and the
            FANTOM Consortium.
  TITLE     Functional annotation of a full-length mouse cDNA collection
  JOURNAL   Nature 409, 685-690 (2001)
REFERENCE   5
  AUTHORS   The FANTOM Consortium and the RIKEN Genome Exploration Research
            Group Phase I & II Team.
  TITLE     Analysis of the mouse transcriptome based on functional annotation
            of 60,770 full-length cDNAs
  JOURNAL   Nature 420, 563-573 (2002)
REFERENCE   6  (bases 1 to 3878)



  AUTHORS   Adachi,J., Aizawa,K., Akimura,T., Arakawa,T., Bono,H., Carninci,P.,
            Fukuda,S., Furuno,M., Hanagaki,T., Hara,A., Hashizume,W.,
            Hayashida,K., Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T.,
            Hori,F., Imotani,K., Ishii,Y., Itoh,M., Kagawa,I., Kasukawa,T.,
            Katoh,H., Kawai,J., Kojima,Y., Kondo,S., Konno,H., Kouda,M.,
            Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Murata,M.,
            Nakamura,M., Nishi,K., Nomura,K., Numazaki,R., Ohno,M., Ohsato,N.,
            Okazaki,Y., Saito,R., Saitoh,H., Sakai,C., Sakai,K., Sakazume,N.,
            Sano,H., Sasaki,D., Shibata,K., Shinagawa,A., Shiraki,T.,
            Sogabe,Y., Tagami,M., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
            Takeda,Y., Tanaka,T., Tomaru,A., Toya,T., Yasunishi,A.,
            Muramatsu,M. and Hayashizaki,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-APR-2002) Yoshihide Hayashizaki, The Institute of
            Physical and Chemical Research (RIKEN), Laboratory for Genome
            Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
            RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
            Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
            URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
            Fax:81-45-503-9216)
COMMENT     cDNA library was prepared and sequenced in Mouse Genome
            Encyclopedia Project of Genome Exploration Research Group in Riken
            Genomic Sciences Center and Genome Science Laboratory in RIKEN.
            Division of Experimental Animal Research in Riken contributed to
            prepare mouse tissues.
            Please visit our web site for further details.
            URL:http://genome.gsc.riken.go.jp/
            URL:http://fantom.gsc.riken.go.jp/.
FEATURES             Location/Qualifiers
     source          1..3878
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="FANTOM_DB:D030003K13"
                     /db_xref="MGI:2418502"
                     /db_xref="taxon:10090"
                     /clone="D030003K13"
                     /tissue_type="whole body"
                     /clone_lib="RIKEN full-length enriched mouse cDNA library"
                     /dev_stage="9 days embryo"
     CDS             109..1437
                     /note="unnamed protein product; ENDOTHELIN B RECEPTOR
                     PRECURSOR (SWISSPROT|P48302, evidence: FASTY, 100%ID,
                     100%length, match=1326)
                     putative"
                     /codon_start=1
                     /protein_id="BAC38908.1"
                     /db_xref="GI:26350537"
                     /translation="MQSPASRCGRALVALLLACGFLGVWGEKRGFPPAQATLSLLGTK
                     EVMTPPTKTSWTRGSNSSLMRSSAPAEVTKGGRGAGVPPRSFPPPCQRNIEISKTFKY
                     INTIVSCLVFVLGIIGNSTLLRIIYKNKCMRNGPNILIASLALGDLLHIIIDIPINTY
                     KLLAEDWPFGAEMCKLVPFIQKASVGITVLSLCALSIDRYRAVASWSRIKGIGVPKWT
                     AVEIVLIWVVSVVLAVPEAIGFDMITSDYKGKPLRVCMLNPFQKTAFMQFYKTAKDWW
                     LFSFYFCLPLAITAVFYTLMTCEMLRKKSGMQIALNDHLKQRREVAKTVFCLVLVFAL
                     CWLPLHLSRILKLTLYDQSNPHRCELLSFLLVLDYIGINMASLNSCINPIALYLVSKR
                     FKNCFKSCLCCWCQTFEEKQSLEEKQSCLKFKANDHGYDNFRSSNKYSSS"
     polyA_signal    3859..3864
                     /note="putative"
     polyA_site      3878
                     /note="putative"
ORIGIN


Query Match 31.9%; Score 1372.2; DB 11; Length 3878; 

Best Local Similarity 66.0%; Pred. No. 8.1e-254; 

Matches 2665; Conservative 0; Mismatches 1043; Indels 327; Gaps 34; 
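
The statistics printed for this alignment can be cross-checked arithmetically. The Python sketch below is an illustration under stated assumptions, not the program's documented formula: each match is taken to score +1, each mismatch -0.6 (a value inferred because it reproduces the printed score, not one stated in the report), each gap to cost Gapop (10.0), and each gapped position Gapext (1.0). Query Match is taken as score divided by the perfect score of 4301, and Best Local Similarity as matches divided by the number of alignment columns. Function names are hypothetical.

```python
def alignment_score(matches, mismatches, indels, gaps,
                    mismatch_penalty=0.6, gap_open=10.0, gap_ext=1.0):
    """Score under the assumed scheme: +1 per match, penalties otherwise."""
    return (matches * 1.0
            - mismatches * mismatch_penalty   # assumed value, see lead-in
            - gaps * gap_open                 # one opening penalty per gap
            - indels * gap_ext)               # one extension per gapped position


def percent_query_match(score, perfect_score):
    """Score as a percentage of the query's self-match score."""
    return 100.0 * score / perfect_score


def percent_local_similarity(matches, mismatches, indels):
    """Identical columns as a percentage of all alignment columns."""
    return 100.0 * matches / (matches + mismatches + indels)


if __name__ == "__main__":
    score = alignment_score(matches=2665, mismatches=1043, indels=327, gaps=34)
    print(round(score, 1))
    print(round(percent_query_match(score, 4301), 1))
    print(round(percent_local_similarity(2665, 1043, 327), 1))
```

With the values printed for Result 1 the three figures come out as 1372.2, 31.9 and 66.0, matching the report. Substituting 2664 matches, as in Result 2, gives 1371.2.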

Qy 193 AAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATGCAGCCGCCTCCA 252 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II 

Db 64 AAACAGCAGAGCGGCTACCAGACTCTCACAGGAGCAAGCTGTAACATGCAATCGCCCGCA 123 

Qy 253 AGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTGTCGCGGATCTGG 312 

II I I I I II I I I I I I I I MM I I I I I I I II I I I I I III I I I II I III 

Db 124 AGCCGGTGCGGACGCGCCTTGGTGGCGCTGCTGCTGGCCTGTGGCTTCTTGGGGGTATGG 183 

Qy 313 GGAGAGGAGAGAGG CTTCCCGCCT GAC AGGGC C ACT C CGCTTTTGCAAACCGCAGAG 369 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I II I 

Db 184 GGAGAGAAAAGAGGATT C C C AC CT GC C CAAG C C AC GCT GT CACT T CT C G GGACT AAAGAG 243 

Qy 370 ATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGTCTGGCGCGG 429

I I I II I I I I M I I I I II I I I M II III Ml I I I I M I II I I I M I M Ml 

Db 244 GT AAT GAC GCCACC CAC T AAGAC CT C CT G GAC C AGAGGT T C CAACT C C AGT CT GAT G C GT 303 

Qy 430 TCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCCACGCACC 489 

Ml I I I I I I I II I I I I II I I I I I I I I III I II I I I I I M I I I I 
Db 304 TCCTCCGCACCTGCGGAGGTGACCAAAGGAGGGAGGGGGGCTGGAGTCCCGCCAAGATC- 362 

Qy 490 AT CTCCCCTCCCCCGTGC CAAG GAC C CAT C GAG AT CAAGGAGAC T T T CAAAT AC AT CAAC 549 

II I I I I I I I I M I M I I II II M I M I I I II I M I I II I M II I II I 

Db 363 — CTTCCCTCCTCCGTGC CAAC GAAATAT T GAGAT C AGCAAGACTT T T AAAT AC AT CAAC 420 

Qy 550 ACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCTGAGA 609 

III I I I II I I I I I I I I II I I II I I I I M II II I I I II II I I I M I II M III 

Db 421 ACGATTGTGTCGTGCCTCGTGTTCGTGCTAGGCATCATCGGGAACTCCACGCTGCTAAGA 480 

Qy 610 AT TAT CT ACAAGAACAAGT GC AT GC GAAAC GGT C CCAAT AT CT T GAT C GC C AGC T T GGC T 669 

II I I M I I II II I I I I I I I I I II I I II I I II I I I I I II I I I I II I I I I I I I I II I 

Db 481 ATCATCTACAAGAACAAGTGCATGCGCAATGGTCCCAATATCTTGATCGCCAGTCTGGCT 540 

Qy 670 CT G GGAGAC CT GCT GCACAT C GT CAT T GACAT C C CT AT CAAT GT CT ACAAGCT GCT G G CA 729 

I I II M I II II I I I I II I I I MM Mill II II II M I I I M I I II III 

Db 541 CT G GGAGAC CTACT G C ACAT CAT CAT AGACATAC C CAT TAAC AC CT ACAAGT T GCT C GCA 600 

Qy 730 GAG GACT GGC C ATT T GGAGCT GAGAT GT GTAAG CTGGTGCCTTT CAT AC AGAAAGC CT C C 789 

I II II II II M II II I I II II I I II I I M M M I I I II II I II I I I II I I II II II 

Db 601 GAG GACT GGC C ATT T GGAGCT GAGAT GT GTAAG CT G GT G C C CT T CAT AC AGAAG GCT T C T 660 

Qy 790 GT G G GAAT CACT GT G CT GAGT C TAT GT G C T CT GAGT AT T GACAGAT AT C GAGCT GTT GCT 849 

II I I I II I I I I I I I I II I II I I I I I I I I I I II I II I I II II I I II I II I II II I M I 

Db 661 GT G G GAAT CAC AGT G CT GAGT CTTTGTGCT CT AAGT AT T GACAGAT AT C G AGC T GT T GCT 720 

Qy 850 T CT T G GAGT AGAAT T AAAG GAAT T GG GGT T C CAAAAT GGACAGC AGT AGAAAT T GT T T T G 909 

II I I I I I II I II I I I II I I I I I I I I II I I II II I I I I I I I I I I I II I I I II II I I II I 
Db 721 T CT T G GAGT C GAAT T AAAG GAAT TGGGGTTC CAAAAT G GAC AGC AGT AGAAAT T GT TT T A 780 



Qy 910 ATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAATTACG 969 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I 
Db 781 ATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCCGAAGCCATAGGTTTTGATATGATTACG 840

Qy 970 AT GGACT ACAAAGGAAGT TAT CT G C GAAT CTGCTTGCTT CAT C C C GT T C AGAAG AC AGCT 1029 

I I I I I I II I I I I I 1 II I I I I I I I I I I I I I II I I I II I II I I I I I 

Db 841 TC GGACT ACAAAGGAAAGCCCCTAAGGGTCTGCATGCTTAATCCCTTTCAGAAAACAGCC 900 

Qy 1030 TTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTTG 1089

I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I M I I I I I II I I I I I I I I I I I I 

Db 901 TTCATGCAGTTTTACAAGACAGCCAAAGATTGGTGGCTGTTCAGTTTCTACTTCTGCTTG 960 

Qy 1090 CCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTGAGAAAGAAA 1149 

II I I I I I I I I I I M I I I I I I I I I I II I I I I I I I I I I I I I I I II I I | I I 

Db 961 C C GCT AGC CAT C AC T GC AGT CT T T TAT AC C CT GAT GAC C T GC GAAAT G CT CAGGAAGAAG 102 0 

Qy 1150 AGT G GC AT G C AGAT T GCT T T AAAT GAT C AC CT AAAGCAGAGAC GGGAAGT G GC CAAAAC C 12 09 

II II I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I II I I I I I I I I I I I I I II 
Db 1021 AGCGGTAT GCAGATT GCTTT GAAT GAT CACTT AAAGCAGAGAC GAGAAGT GGCCAAGACA 1080 

Qy 1210 GTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGATT 1269 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I 
Db 1081 GTCTTCTGCCTGGTCCTCGTGTTTGCTCTCTGTTGGCTTCCCCTTCACCTCAGCCGGATC 1140 

Qy 127 0 C T GAAGC T CAC T CT T T AT AAT C AGAAT GAT CC CAAT AGAT GT GAAC T T T T GAGC T T T CT G 132 9 

I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I M I I I I I I I I I I 
Db 1141 CTGAAGCTCACCCTGTATGACCAGAGCAATCCACACAGGTGTGAGCTTCTGAGCTTTTTG 12 00 

Qy 1330 T T GGT AT T GGACT AT AT T G GT AT CAACAT G GCT T CACT GAAT T C CT GC AT TAAC C CAATT 138 9 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I 1 I I I MM II II II II II Mill 
Db 1201 T T GGT TT T GGACTACAT TGGTAT CAACAT GGCTTCTTTGAACTCCTGCAT CAAT C CAAT C 1260 

Qy 1390 GCT CTGTATTT GGT GAGCAAAAGATTCAAAAACT GCTTT AAGTCATGCTT AT GCTGCTGG 1449 

I II II I II I I II II I I I I I II M II II I I I I I I I I M II I M I I I I I II I I I II I I I I 

Db 1261 GCT CTGTATTT GGT GAGCAAAAGATTCAAAAACTGCTTTAAGTCATGTTTGT GCTGCTGG 132 0 

Qy 1450 T GC CAGT CAT T T GAAGAAAAAC AGT C CTT G GAG GAAAAGC AGT C GT GCT T AAAGT T C AAA 1509 

II I I I I Mill I II II I II I I II I I II I I I I I I I II I I III I II I I II M I 

Db 1321 T GC CAAAC GT T T GAGGAAAAGCAGT C CTT G GAGGAGAAG CAGT C C T GC CT GAAGT T CAAA 138 0 

Qy 1510 G CT AAT GAT CAC GGAT AT GACAACTT C C GT T C CAGT AAT AAAT AC AG CT CAT CTT GAAAG 1569 

II II I II I I I I I II II I II I I I I II II I II II I I I I II I I II I M I I I I II I I I 
Db 1381 G C CAAC GAT CAC G GAT AT GACAACTT C C GGT C CAG CAAT AAAT AC AG CTCGTCTT GAAGG 144 0 

Qy 157 0 AAGAACT AT T CAC T GT AT T T CAT T T T C T T TAT AT T GGAC C GAAGT CAT T AAAAC AAAAT G 1629 

I I I I I I I II II I I II I I I II I II I I II II II II II I I II I 
Db 1441 CAAGAACACT C GC C GAAT CT CACT GT C CT CAT T GT GGACAGAT AC CAT T AAAAC AAAAT G 1500 

Qy 1630 AAACAT T T GC CAAAACAAAAC AAAAAACT AT GT AT T T GC ACAGCACACT AT T AAAAT ATT 1689 

II I I I I I I II I I I II I I II II I Mill I II 

Db 1501 AAACCGTTGCCAAATCAAAATGGAAAAAACCATGCTAGCAGAAAGGTGTGCGCGCGTGTG 1560 

Qy 1690 AAGT GTAATTATTTTAACACT CACAGCTACATATGAC ATTTTATGAGCTGTTTAC 1744 

I III I I I I I II I I I I II I I I II II I I I I 

Db 1561 AGAG GGAT TAT T T T TAACT GT T C T GAC GCT CAAC AC C G GAT AT AT T CAC GGGCTGTT T AC 1620 

Qy 1745 GGC AT G GAAAGAAAAT CAGT G GGAAT T AAGAAAG C CT C GT C GT GAAAG C ACT T AAT T T TT 1804 



II II I I 1 1 1 1 1 1 1 1 1 1 1 1 I I 1 1 1 1 1 1 1 1 1 III 

Db 1621 AACCTAAGAAAGCTGTGGGAAGGAATGAAGCCCTCCTCCGTGGGGAAGCACTTAGATTCT 1680 

Qy 1805 TACAGTTAGCACTTCAACATAGCT CTTAACAACTT CCAGGATATTCACACAACACTTAGG 18 64 

I II I I I I I I I I M I I I I I M I I I I I I I 

Db 1681 T— AGTCAGCACTTCAGCAGAGCTCTTAAAAGCCCCTAGTGCGTTCACATGCCACTTACG 1738 

Qy 18 65 CT T AAAAAT GAG CT C ACT CAGAAT TT CT AT T CT T T CT AAAAAGAGAT T TAT T T T TAAAT C 1924 

I I I I I I I I I I I I I I I I I I I II I I I I 

Db 1739 TTTAAAAA AAC GAGAAC TT C ACT GAAGT T CT GT T C AG GAGT T TAT TAT C C AGT 1791 

Qy 1925 AAT GGGACT CT GAT AT AAAGGAAGAATAAGT C ACT GT AAAAC AGAACT T T TAAAT GAAG C 1984 

I I I I I II I I I I I I I I I I I I I I M I I I I Ml M I 
Db 1792 C C TAT GAAT C T G GAT T C AAGAAAG CAT — GACATT GCAAAACAATTCTTAAAACGAAGTT 1849 

Qy 1985 T TAAAT TACT CAAT T T AAAAT T T T AAAAT C CT T T AAAAC AACT T T T CAAT TAAT AT TAT C 2044 

I I I I I I I I I I I I I MM I I I I I I M I M I I I I 

Db 18 50 T CAAT T GCTTAAT T T GAAACT T AAAAAAAAAAAAACT AATAAAT T T T TAT GCAT ACT AT C 1909 

Qy 2045 — ACACT AT TAT CAGATT GT AAT T AGAT GCAAAT GAGAGAGC AGT T T AGT T GT T G C A- T T 2101 

I I I I I I I II I II I II M M I II M M I I M I I II I I I Ml 
Db 1910 AT AC C C ACT AAT CT GAT T GT AACT AT AT GCAAAAGAAAAGG CAAT AT GGT T GGT AAACT T 1969 

Qy 2102 T T T C GGACACT GGAAACAT T TAAAT GAT CAG GAGGGAGT AACAGAAAGAGCAAGGCT GT T 2161 

I I I I I I I I II II I I I I I M M I II I I II I I I 
Db 197 0 TT T T GGT CAT T AC CAACAT T GAAAT GAT CAGAAT T C GG G G GAAGAAA 2016 

Qy 2162 T T T GAAAAT CAT T AC ACT T T C ACT AGAAG CC CAAAC CT C AGCAT T CT GCAAT AT GTAAC C 2221 

I III 

Db 2017 AGACAGCC 2024 

Qy 2222 AACAT GT CACAAACAAGC AG CAT GTAAC AGACT GGCAC AT GT GC CAG CT GAAT T T AAAAT 2281 

I I MM I M M I I I II 
Db 2025 TGCGAAT GCCACAGAGAAAACAT GGGAAAGCGT G 2058 

Qy 2282 AT AATACTTTTAAAAAGAAAATTATTACATC CTTTACATT CAGTTAAGAT CAAACCT CAC 2341 

I II I I I I I I I I I I I I I I I 

Db 2059 AGCT GC T AT GC CT GAGACT T CT GAAAT T C C CT CACACAT ACT CT GC AG 2106 

Qy 2342 AAAGAGAAATAGAAT GT TT GAAAGGCT ATCC CAAAAGACTTTTTT GAAT CT GT CATT CAC 2401 

| I I I I I I I III I I I I I I I II I I I I I I 

Db 2107 AAAGACACAAA AC AGAAC ACT AC CT AT GAT T T CTT T AAAGT T CT T T CAAAT 2157 

Qy 2402 AT AC C CT GT GAAGACAAT ACT AT CT ACAATT T T T T CAGGAT TAT TAAAAT CTTCTTTTTT 2461 

Ml I I II II I I II I I I II I I I I 
Db 2158 AT C CT T T CAT GAT T GAAGT T TAAAT T C CAT GT GT T CAACTT CAT C A 2203 

Qy 2462 CACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACAC 2521

I I II I II I I I I I I I III 

Db 2204 TCTGTAAATACTTAGCTATTAGCTATAAGCAC 2235 

Qy 2522 T G CAT GT AG AT GAT TAAAT G A — GGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGA 257 9 

I M I II I I I I I I I I II I I I I I I II I I I I I I I II I I I I I 
Db 2236 T ACAC GT AGAG GACT T AACAAAGG G CAG GT C C C AG CGT T C GT AG CT T T C T GACAAAGAGA 2295 



Qy 2580 TGCCAGTGACCTCATAAT--AAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAA 2637
I I I I II I I I I III I II I I I I M I I I I I I II I I I I I II II I I I I I I I M 



Db 2296 T G C C AGT AAC C C GGT T AT AGACAGAAT GT GAAT T G C C C GGT GC AGT GT C C AC AT G GCAAA 2355 

Qy 2638 GGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTAT 2697 

| | | | M M I I I I I II I M I I I I II I I I I I I I I I I Ml I 
Db 2356 GAAGCAGGGAGCATC — CTTTCAGCCATGCTGTAGAGAAAATGGTCCACAGC AC 2407 

Qy 2698 AAT G CT AT AGT T AAAAT AC TAT T T T T C AAAAT C AT AC AGAT T AGT - AC AT T T AAC AG C T A 2756 

I I M I I I I I I II II I I I I I M I I I 

Db 2408 AATATGATAGCGAAAATACCGTGGTTTAACGCCATAGAAAATAGTCACTGTAACCAGCTC 2467 

Qy 2757 CCTGTAAAGCTTATTACTAA-TTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTT 2815 

|| | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 
Db 2468 TCTCGGAGGCATACTACCAACTTTTTATGTTATTCCTGAAAATAGCCAATAGAAAGGCGT 2527

Qy 2816 GCT T GAC AT GGTGCTTTTCTTT CAT CTAGAG GCAAAACT G CT T T T T GAGAC C GTAAGAAC 2 875 

M I I I I I I I I I I I I I I I II I I II I I I I M I I I 

Db 2528 TCTGGACATGGTGCTTTTTCTAAAACGTAGAAGCCAAACTGCTTCGGGGTCTGCAAGATC 2587 

Qy 2876 CTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAA 2923

I I I I I I I I I I I I I I II I I I I I I I I I I I I M 
Db 2588 CTCCT — CTTTGCGCATTCTTGTCTAGGTTTTTTTTTTTTTTTTTTAATCTCCTTCCACG 2645 

Qy 2 924 — GT GC CT T AGGAT AG CT T G GGAT GAGAT GT GT GT GAAAGT AT GT ACAAGAGAAAAC GGA 2 981 

I M II I I I I I I II I I M I I I I I I I I I I I I I I I M I I I I I I I I I I I I I 
Db 2646 AC GT GC CT T AGGTT C ACT C C GGAT GAGC GGT GT GT GAAAGAAT GC C CAAGAGAAAACT GA 2705 

Qy 2982 AGAGAGAGGAAAT GAG GT GG GGT T GGAG GAAAC C CAT G GGGACAGAT T C C CAT T CT T AGC 3041 

|| I I I I I I I I I II I I I I I I I I I I I M I I IN MINI I I I I I I I I I I I I I I I I 
Db 2706 AGAGAGAGGAAAT GAGGT GGGGC C AGAGGAAGC C C GT GGGGAAAT AT T C C CAT T CT T AGC 2765 

Qy 3042 CTAAC GT T C GT CAT T GC CT C GT C ACAT CAAT GCAAAAGGT C CT GAT T T T GT T C C AGC AAA 3101 

I I I I I I I I I I I I I I II I I II II I M I I I I I I I I I I I I I I M I I I 

Db 2766 CCTGTGTTCGTCACTGCCACGTCATGTCGGTGTGAAAGGTCCTGGTTCGGCTCCAGCAAA 2825 

Qy 3102 ACAC AGT GCAAT GT T CT C AGAGT GAC T T T C GAAAT AAATT GGGC C CAAGAGCT TTAACT C 3161 

M | | I II I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I 

Db 2826 ACAAAGCGCAGCGTTCTCAGCGTGAC-TCGGGAACAAACCAAGCCCGAGAGCTTTAACCT 2884 



Qy 3162 GGTCTTAAAATATGCCCAAATTTT 3185
Db 2885 TGTCTTAAAATATAACAGATTTTCCTTCCTTCCTTTTTCTCTTTCTTCTCTTCTCTTCTC 2944

Qy 3186 TACTTTGTTTTTCTTTTAATAGGCTGGGCCACATG 3220
Db 2945 TTCTCTTCTCTTCTCTTCTCTTCTCTTCTCTTCTCTTCTTTTCATAACCCAGGCCACATG 3004



Qy 3221 TTGGAAATAAGCTAGTAATGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACC 3280

I | | || I I I I I I I Ml I I I I I I I I I I I I I I I I I I I I 

Db 3005 TTGAAAATGAGCTTAACAATGCAGTTTTCTACCAAAATCATTGTGACAATACAATAAACC 3064

Qy 3281 AAAAC C CAACAAT GT G GC C AGAAAGAAAGAG CAAT AAT AAT T AAT T C ACAC AC CAT AT GG 3340 

MM I M I II II I M I I M I II M I I I I I I I I I 

Db 3065 C AAAC GGGACAAT GAG GT AAAAAAC CAAGAAC AAT ACT GAAT C C AC GT GAC AC ATG 3120 

Qy 3341 ATTCTATTTATTAATCACCCACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAG 3400

I II I M I I I I I I I II I I I I I I I I I 

Db 3121 ACTCTCTTTAGGAGTCACCCACAGTTCTTGTGTGTA CAGAT 3161 



Qy 3401 AGGCCTGTTATCATAGAAGTCATTTTAGACTCTCAATTTTAAATTAATT-TTGAATCACT 3459
Db 3162 TGCTTTTTAATCATAAAGGACGCCCCAGATCTTCAATTTTAAGTTAGTTATTGGCTCCCC 3221

Qy 3460 AATATTTTCACAGTTTATTAATATATTTAATTTCTATTTAAATTTTAGATTATTTTTATT 3519
Db 3222 AGTAGTTTCACAGCGTGGATATATTTTTAATTTTTA-CTAAGTTTTAGATTGGTTTTATT 3280

Qy 3520 ACCATGTACTGAATTTTTACATCCTGATACCCTTTCCTTCTCCATGTCAGTA 3571
Db 3281 GTTGTGTTCTAAATTCTTAAGTCCTAACATCTTTGTTTAACCCAGATGTTCCTTCCCTCT 3340

Qy 3572 TCATGTTCTCTAATTATCTTGCCAAATTTTGAAACTACACACAAAAAGCATACTTGCATT 3631
Db 3341 TCATGGGCAATAATCGTCCTGCCAAATTATGAAATGGCATAAGAATACTATTCACATAAT 3400

Qy 3632 ATTTATAATAAAATTGCATTCAGTGGCTTTTTAAAAAAAATGTTTGATTCAAAACTTTAA 3691
Db 3401 ATATACAATAAAACTATATTAAGTGGCTTTTTTATTAAAAATTTTAGCACACAG 3454

Qy 3692 CATACTGATAAGTAAGAAACAATTATAATTTCTTTACATACTCAAAACCAAGATAGAAAA 3751
Db 3455 ACCAAGGGTGATAAGAAAAAAAACATGATTCCCTTGCATAATTAAAACCAAGATAAGAGA 3514

Qy 3752 AGGTGCTATCGTTCAACTTCAAAACATGTTTCCTAGTATTAAGGACTTTAATATAGCAAC 3811
Db 3515 AGGTACCATCTAATTTAAAGCATATTTTCTAACATTTAAGTAGCCTAATATAGCAAT 3571

Qy 3812 AGACAAAATTATTGTTAACATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACC 3871
Db 3572 GCATAAAAATAGTGTTAACAAGGATGTTAGAGGTCAAACGATTTGTAAGTGACTTCAGCC 3631

Qy 3872 TATTTTCTCCCTTATTATCCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATT 3931
Db 3632 TATTTTCTCCCAAATTATTTACTGCTATTTTTGGTCTGTGTTCAAACA-TTTTCAGTATT 3690

Qy 3932 GATAGCTTACATATGGCCAAAGGAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCT 3991
Db 3691 GATAATGTGCA-ACAGCCAAAGGAACACTGTTTTCATCCAAATGCGGGTGTGTTGTACCT 3749

Qy 3992 AACTTTATAAAAGTGTAATATAACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGG 4051
Db 3750 AAC---ATGCACTTGTAATAAAGCCGTGTAAAA--TAACTGTGTTTTGTTTTGCTCTGG 3803

Qy 4052 TTGCCTAAAGTGGCTATAGTTACTGATTTTTTATTATGTAAGCAAAACCAATAA 4105
Db 3804 TCACCTAAAGTGGCAGCTTGTGTCGTTGCTAACTTCTTGTTGAGTAAGCAAAACCAATAA 3863

Qy 4106 AAATTTAAGTTTTTT 4120
Db 3864 ACGTTCAAATGGTTT 3878

RESULT 2
DEFINITION  Mus musculus 0 day neonate kidney cDNA, RIKEN full-length enriched
            library, clone:D630038G12 product:ENDOTHELIN B RECEPTOR PRECURSOR,
            full insert sequence.
ACCESSION   AK085532
VERSION     AK085532.1  GI:26351656
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
  MEDLINE   20530913
   PUBMED   11076861
REFERENCE   4
  AUTHORS   The RIKEN Genome Exploration Research Group Phase II Team and the
            FANTOM Consortium.
  TITLE     Functional annotation of a full-length mouse cDNA collection
  JOURNAL   Nature 409, 685-690 (2001)
REFERENCE   5
  AUTHORS   The FANTOM Consortium and the RIKEN Genome Exploration Research
            Group Phase I & II Team.
  TITLE     Analysis of the mouse transcriptome based on functional annotation
            of 60,770 full-length cDNAs
  JOURNAL   Nature 420, 563-573 (2002)
            Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Murata,M.,
            Nakamura,M., Nishi,K., Nomura,K., Numazaki,R., Ohno,M., Ohsato,N.,
            Okazaki,Y., Saito,R., Saitoh,H., Sakai,C., Sakai,K., Sakazume,N.,
            Sano,H., Sasaki,D., Shibata,K., Shinagawa,A., Shiraki,T.,
            Sogabe,Y., Tagami,M., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
            Takeda,Y., Tanaka,T., Tomaru,A., Toya,T., Yasunishi,A.,
            Muramatsu,M. and Hayashizaki,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-APR-2002) Yoshihide Hayashizaki, The Institute of
            Physical and Chemical Research (RIKEN), Laboratory for Genome
            Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
            RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
            Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
            URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
            Fax:81-45-503-9216)
COMMENT     cDNA library was prepared and sequenced in Mouse Genome
            Encyclopedia Project of Genome Exploration Research Group in Riken
            Genomic Sciences Center and Genome Science Laboratory in RIKEN.
            Division of Experimental Animal Research in Riken contributed to
            prepare mouse tissues.
            Please visit our web site for further details.
            URL:http://genome.gsc.riken.go.jp/
            URL:http://fantom.gsc.riken.go.jp/.
FEATURES             Location/Qualifiers
     source          1..3990
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="FANTOM_DB:D630038G12"
                     /db_xref="MGI:2422642"
                     /db_xref="taxon:10090"
                     /clone="D630038G12"
                     /tissue_type="kidney"
                     /clone_lib="RIKEN full-length enriched mouse cDNA library"
                     /dev_stage="0 day neonate"
     CDS             222..1550
                     /note="unnamed protein product; ENDOTHELIN B RECEPTOR
                     PRECURSOR (SWISSPROT|P48302, evidence: FASTY, 100%ID,
                     100%length, match=1326)
                     putative"
                     /codon_start=1
                     /protein_id="BAC39465.1"
                     /db_xref="GI:26351657"
                     /translation="MQSPASRCGRALVALLLACGFLGVWGEKRGFPPAQATLSLLGTK
                     EVMTPPTKTSWTRGSNSSLMRSSAPAEVTKGGRGAGVPPRSFPPPCQRNIEISKTFKY
                     INTIVSCLVFVLGIIGNSTLLRIIYKNKCMRNGPNILIASLALGDLLHIIIDIPINTY
                     KLLAEDWPFGAEMCKLVPFIQKASVGITVLSLCALSIDRYRAVASWSRIKGIGVPKWT
                     AVEIVLIWVVSVVLAVPEAIGFDMITSDYKGKPLRVCMLNPFQKTAFMQFYKTAKDWW
                     LFSFYFCLPLAITAVFYTLMTCEMLRKKSGMQIALNDHLKQRREVAKTVFCLVLVFAL
                     CWLPLHLSRILKLTLYDQSNPHRCELLSFLLVLDYIGINMASLNSCINPIALYLVSKR
                     FKNCFKSCLCCWCQTFEEKQSLEEKQSCLKFKANDHGYDNFRSSNKYSSS"
     polyA_signal    3972..3977
                     /note="putative"
     polyA_site      3990
                     /note="putative"
ORIGIN



Query Match 31.9%; Score 1371.2; DB 11; Length 3990; 

Best Local Similarity 66.0%; Pred. No. 1.3e-253; 

Matches 2664; Conservative 0; Mismatches 1043; Indels 327; Gaps 34;



Qy  193 AAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATGCAGCCGCCTCCA 252
        |||| || ||||||| ||| ||| |   | |||||| |  | | ||||||  ||||  ||
Db  177 AAACAGCAGAGCGGCTACCAGACTCTCACAGGAGCAAGCTGTAACATGCAATCGCCCGCA 236

Qy  253 AGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTGTCGCGGATCTGG 312
        || | ||||||||||||| |||| |||||| | || ||||| ||| | | | || | |||
Db  237 AGCCGGTGCGGACGCGCCTTGGTGGCGCTGCTGCTGGCCTGTGGCTTCTTGGGGGTATGG 296

Qy  313 GGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTC---CGCTTTTGCAAACCGCAGAG 369
        |||||| | ||||| ||||| |||| |   ||||| |   | ||| |    ||   ||||
Db  297 GGAGAGAAAAGAGGATTCCCACCTGCCCAAGCCACGCTGTCACTTCTCGGGACTAAAGAG 356

Qy  370 ATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGTCTGGCGCGG 429
         ||||||||||||||||||||||||  ||| |||  ||||||||| ||||||||  |||
Db  357 GTAATGACGCCACCCACTAAGACCTCCTGGACCAGAGGTTCCAACTCCAGTCTGATGCGT 416

Qy  430 TCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCCACGCACC 489
        || |  ||||||||||||||| | |||||||  |||  ||| |||   |||||| |  |
Db  417 TCCTCCGCACCTGCGGAGGTGACCAAAGGAGGGAGGGGGGCTGGAGTCCCGCCAAGATC- 475

Qy  490 ATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAATACATCAAC 549
          || |||||| ||||||||| ||   || |||||||   ||||||| ||||||||||||
Db  476 --CTTCCCTCCTCCGTGCCAACGAAATATTGAGATCAGCAAGACTTTTAAATACATCAAC 533

Qy  550 ACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCTGAGA 609
        ||| ||||||| ||||| ||||||||||| || ||||||||||||||||| || || |||
Db  534 ACGATTGTGTCGTGCCTCGTGTTCGTGCTAGGCATCATCGGGAACTCCACGCTGCTAAGA 593

Qy  610 ATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCCAGCTTGGCT 669
        || ||||||||||||||||||||||| || |||||||||||||||||||||||  |||||
Db  594 ATCATCTACAAGAACAAGTGCATGCGCAATGGTCCCAATATCTTGATCGCCAGTCTGGCT 653

Qy  670 CTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCTGGCA 729
        ||||||||||| ||||||||| |||| ||||| || || ||   ||||||| |||| |||
Db  654 CTGGGAGACCTACTGCACATCATCATAGACATACCCATTAACACCTACAAGTTGCTCGCA 713

Qy  730 GAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAGAAAGCCTCC 789
        ||||||||||||||||||||||||||||||||||||||||| ||||||||||| || ||
Db  714 GAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCCTTCATACAGAAGGCTTCT 773

Qy  790 GTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGAGCTGTTGCT 849
        ||||||||||| ||||||||||| |||||||| |||||||||||||||||||||||||||
Db  774 GTGGGAATCACAGTGCTGAGTCTTTGTGCTCTAAGTATTGACAGATATCGAGCTGTTGCT 833

Qy  850 TCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTTG 909
        ||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||
Db  834 TCTTGGAGTCGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTTA 893

Qy  910 ATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAATTACG 969
        |||||||||||||||||||||||||||||||| |||||||||||||||||||| ||||||
Db  894 ATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCCGAAGCCATAGGTTTTGATATGATTACG 953

Qy  970 ATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAGAAGACAGCT 1029
          ||||||||||||||     ||  |  ||||| ||||| ||||| ||||||| |||||
Db  954 TCGGACTACAAAGGAAAGCCCCTAAGGGTCTGCATGCTTAATCCCTTTCAGAAAACAGCC 1013



Qy 1030 TTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTTG 1089
        ||||||||||||||||||||||| |||||||||||||||||||||||||| |||||||||
Db 1014 TTCATGCAGTTTTACAAGACAGCCAAAGATTGGTGGCTGTTCAGTTTCTACTTCTGCTTG 1073

Qy 1090 CCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTGAGAAAGAAA 1149
        ||  | |||||||||||| | |||||||| || |||||||| |||||| | || |||||
Db 1074 CCGCTAGCCATCACTGCAGTCTTTTATACCCTGATGACCTGCGAAATGCTCAGGAAGAAG 1133

Qy 1150 AGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTGGCCAAAACC 1209
        || || |||||||||||||| ||||||||| ||||||||||||| ||||||||||| ||
Db 1134 AGCGGTATGCAGATTGCTTTGAATGATCACTTAAAGCAGAGACGAGAAGTGGCCAAGACA 1193

Qy 1210 GTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGATT 1269
        ||||| ||||||||||| || ||||| ||||| ||||||||||||||||||||| ||||
Db 1194 GTCTTCTGCCTGGTCCTCGTGTTTGCTCTCTGTTGGCTTCCCCTTCACCTCAGCCGGATC 1253

Qy 1270 CTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTGAGCTTTCTG 1329
        ||||||||||| || ||| | ||||   ||||  | || ||||| ||| |||||||| ||
Db 1254 CTGAAGCTCACCCTGTATGACCAGAGCAATCCACACAGGTGTGAGCTTCTGAGCTTTTTG 1313

Qy 1330 TTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATTAACCCAATT 1389
        ||||| |||||||| ||||||||||||||||||||  |||| |||||||| || |||||
Db 1314 TTGGTTTTGGACTACATTGGTATCAACATGGCTTCTTTGAACTCCTGCATCAATCCAATC 1373

Qy 1390 GCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCTGCTGG 1449
        ||||||||||||||||||||||||||||||||||||||||||||||| || |||||||||
Db 1374 GCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGTTTGTGCTGCTGG 1433

Qy 1450 TGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTAAAGTTCAAA 1509
        |||||  | ||||| ||||| |||||||||||||| |||||||| ||| | |||||||||
Db 1434 TGCCAAACGTTTGAGGAAAAGCAGTCCTTGGAGGAGAAGCAGTCCTGCCTGAAGTTCAAA 1493

Qy 1510 GCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCATCTTGAAAG 1569
        || || ||||||||||||||||||||||| ||||| |||||||||||||| ||||||| |
Db 1494 GCCAACGATCACGGATATGACAACTTCCGGTCCAGCAATAAATACAGCTCGTCTTGAAGG 1553

Qy 1570 AAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAAAACAAAATG 1629
         |  |  | || | | || ||| | || | ||  ||||| ||   |||||||||||||||
Db 1554 CAAGAACACTCGCCGAATCTCACTGTCCTCATTGTGGACAGATACCATTAAAACAAAATG 1613
1630 AAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTATTAAAATATT 1689 



I I 



j_ x \j\s^j-ur^t u « - - 

II II II II M M I MM I Mill I I I 



1673 



1614 AAACCGTTGCCAAATCAAAATGGAAAAAACCATGCTAGCAGAAAGGTGTGCGCGCGTGTG 

1690 AAGTGTAATTATTTTAACACTCACAGCTACATATGAC ATTTTATGAGCTGTTTAC 1744 

I III I II II I I I I I I I I I I M M I M M 

1674 A.GAGGGATTATTTTTAACTGTTCTGACGCTCAACACCGGATATATTCACGGGCTGTTTAC 1733 

1745 GGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAATTTTT 1804 

II M I I II I I I M I I I M I I M II II I M I I I 

1734 AACCTAAGAAAGCTGTGGGAAGGAATGAA.GCCCTCCTCCGTGGGGAAGCACTTAGATTCT 1793 

1805 TACAGTTAGCACTTCAACATAGCTCTTAA.CAACTTCCAGGATATTCACACAACACTTAGG 1864 

I M I II I I I M M M M I I I I I I I I I III M M II I I I I I I I 
1794 T--AGTCAGCACTTCAGCAGAGCTCTTAAAAGCCCCTAGTGCGTTCACATGCCACTTACG 1851 

1865 CTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTAAATC 1924 



1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 I I 1 1 1 1 1 1 I I 

1852 TTTAAAAA AAC GAGAAC T T C ACT GAAGT T CT GT T C AGGAGT T TAT TAT C C AGT 



1904 
1984 



1925 AAT G G GACT C T GAT AT AAAGGAAGAAT AAGT C ACT GTAAAAC AGAACT T T T AAAT GAAGC 

I III I I I I I I I I I I I Ml M I 

1905 C CT AT GAAT CT GGAT T CAAGAAAGC AT - - GAC AT T GCAAAACAAT T CT T AAAAC GAAGT T 1962 



1985 TT AAAT TACT CAAT TT AAAAT T T T AAAAT C CT T T AAAACAACT T T T CAAT T AAT AT TAT C 
| M | | | II III II II I I I I I HI M III MM 

1963 T CAAT T G CT T AAT T T GAAACT T AAAAAAAAAAAAACTAAT AAAT T T T TAT GCAT AC TAT C 



2044 
2022 



2045 ACACTATT AT CAGATT GTAATTAGAT GCAAATGAGAGAGCAGT TT AGT T GTT GCA- TT 2101 

Ml | | M I I M I I I I II I I I I I I I I I I I I I I I I I M I IN 
AT AC C CACT AAT CT GAT T GT AACT AT AT GC AAAAGAAAAGG CAAT AT GGT T GGT AAACT T 



2023 
2102 



2082 
2161 



T TT C GGAC ACT G GAAAC AT TT AAAT GAT CAG GAGGGAGT AACAGAAAGAGCAAGG CT GT T 

III I I I I I I Mill I I Mill 

2083 T T T T G GT CAT T AC C AAC AT T GAAAT GAT C AGAAT T C G G G G GAAGAAA 21 Z3 

2162 TTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAACC 2221 
213Q AGACAGCC 2137 

2222 AACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAAT 2281 

I I M I I I MM III M 
2138 T G C GAAT G C C AC AGAGAAAAC AT G G GAAAG C GT G 

22 82 AT AAT ACT T T T AAAAAGAAAAT TAT T AC AT C CT T T ACAT T C AGT T AAGAT CAAAC CT C AC 2341 

| || III I I I I I I I I I I I I 

21 72 AGCTGCTATGCCTGAGACTTCTGAAATTCCCTCACACATACTCTGCAG 2219 

2342 AAAGAGAAAT AGAAT GT TT GAAAGGCT AT C C C AAAAGACTT T T T T GAAT CT GT CAT T CAC 2401 

MMI | | | III I I I II M Ml II I I I 

2220 AAAGACACAAA AC AGAAC ACT ACCT AT GAT T T CT T T AAAGT T CT T T CAAAT 2270 

24 02 AT AC C CT GT GAAGACAAT ACT AT CT AC AAT T T T T T CAGGAT TAT T AAAAT CTTCTTTTTT 2461 

II I I I I I I I 11 1 

2271 AT C CT T T CAT GAT T GAAGT T T AAATT C CAT GT GT T CAACT T CAT C A 2316 

24 62 CACT AT CGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTT ACCT ACAT ACAC 2521 

I I I I I II I II M I I III 

2317 TCTGTAAATACTTAGCTATTAGCTATAAGCAC 234 8 

2522 T GCAT GT AGAT GATTAAAT GA — GGGCAGGCC CT GT GCT CATAGCTTT ACGATGGAGAGA 2579 

| M | | | | | M I II I I I II I I M II I M 

234 9 TACACGTAGAGGACTTAACAAAGGGCAGGTCCCAGCGTTCGTAGCTTTCTGACAAAGAGA 



2408 



2580 T GC C AGT GAC CT CAT AAT - - AAAGACT GT GAACT GC CT G GT G C AGT GT C C ACAT GACAAA 2637 

I I I M II II I Ml I I I I I I II I I I I M M II I II I I I M M M I I II I 

2409 T GC C AGT AAC C C GGT TAT AGAC AGAAT GT GAAT T GC C C GGT GCAGT GT C C ACAT G GCAAA 2468 
2638 GGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTAT 2697 

I M II I II I I I M II I I I I I I II M I I I I I M M I I I I 

2469 GAAGCAGGGAGCATC — CTTTC AGC CAT GCT GTAGAG AAAAT GGT CCACAGC AC 2520 



2698 AAT G CT AT AGT T AAAAT AC TAT T T T T C AAAAT CAT AC AGAT T AGT - ACAT T TAACAGCT A 2756 
Ml MM II I II M I II M I I I I I I M I I M I I M I M 



2521 AAT AT GAT AG C GAAAAT AC C GT GGT T TAAC G C C AT AGAAAAT AGT C ACT GTAAC C AG CT C 2580 

2757 C CT GT AAAG CT TAT T ACT AA- T T T T T GT AT TAT T T T T GT AAAT AG C CAAT AGAAAAGT T T 2815 

|| | | | | | | | I I I I I M I I I I I I I M II I I I I I I I M I I II I I I 
2581 TCTCGGAGGCATACTACCAACTTTTTATGTTATTCCTGAAAATAGCCT^TAGAAAGGCGT 2 640 

2816 GC T T GAC AT GGTGCTTTTCTTT CAT CT AGAGG CAAAACT GCT T T T T GAGAC C GT AAGAAC 2875 

II I I I II I I I M I I I I I I 

2641 TCTGGACATGGTGCTTTTTCTAAAACGTAGAAGCCAAACTGCTTCGGGGTCTGCAAGATC 2700 

2876 CTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAA 2923 

Ml I II I I I I I I I I II I M I I I I I I I I I II 
2701 CTCCT— CTTTGCGCATTCTTGTCTAGGTTTTTTTTTTTTTTTTTTAATCTCCTTCCACG 2758 

2924 — GT GC CT T AGGAT AGCT T G G GAT GAGAT GT GT GT GAAAGT AT GT ACAAGAGAAAAC GGA 2981 

| | M I I II I I I M I I I I I M I I I I I I I I I II I I I I I I M I I I I II I I 
2759 AC GT GC CT T AG GT T CACT C C G GAT GAGCGGT GT GT GAAAGAAT GC C CAAGAGAAAAC T GA 2818 

2982 AGAGAGAG GAAAT GAGGT GG GGT T GGAGGAAAC C C AT GGGGACAGAT T C C CAT T CTT AG C 3041 

I II I I I I I I I I I I I Ml I I M M I I I M I I M I I 

2 819 AGAGAGAG GAAAT GAG GT GGGGC C AGAGGAAGC C C GT GGGGAAAT ATT C C CAT T CT T AG C 2878 

3042 CTAACGTTCGTCATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAA 3101 

| I I I I I I I I II I I I I I I I I I 

2879 C CT GT GTT C GT CACT GCCACGT CAT GTC GGT GTGAAAGGTCCT GGT TCGGCTCCAGCAAA 293 8 

3102 ACAC AGT GCAAT GT T CT CAGAGT GACT T T C GAAAT AAAT T GGGC C CAAGAG CT T T AACT C 3161 

| | | M II I I I I I I I I I I I I I I I I M III I I I I I I I I I M I I I I 

2939 ACAAAGCGCAGCGTTCTCAGCGTGAC-TCGGGAACAAACCAAGCCCGAGAGCTTTAACCT 2997 

3162 GGTCTTAAAATATGCCCAAATTTT 3185 

I I I I I I I I I I I I I I I M 
2998 TGTCTTAAAATATAACAGATTTTCCTTCCTTCCTTTTTCTCTTTCTTCTCTTCTCTTCTC 3057 

3186 TACTTTGTTTTTCTTTTAATAGGCTGGGCCACATG 3220 

I I I I I I I I I I I I I I I I I I I M I I I I 
3058 TTCTCTTCTCTTCTCTTCTCTTCTCTTCTCTTCTCTTCTTTTCATAACCCAGGCCACATG 3117 

3221 T T GGAAAT AAGCT AGTAAT GT T GT T T T CT GT CAAT ATT GAAT GT GAT GGT ACAGT AAAC C 3280 

| M | || | | M I I I I I I M I I II I 

3118 T T GAAAAT GAGCTT AACAAT G C AGT T T T CT AC CAAAAT CAT T GT GACAAT AC AATAAAC C 3177 

3281 AAAAC C CAACAAT GT GG C CAGAAAGAAAGAGC AAT AAT AAT T AAT T CAC ACAC CAT AT GG 3340 

I I M I I I I M I I I I I I I I I I I I I I I I I I M I I I 
3178 C AAAC GGGAC AAT GAGGT AAAAAAC CAAGAAC AAT ACT GAAT C CAC GT GAC AC ATG 3233 

3341 AT T CT ATT TAT AAAT CAC C C AC AAACT T GT T CT T TAAT T T CAT CCCAAT C ACT T T TT C AG 3400 

I I I I I I I I I I I I II I I I I I I I I I I 
3234 AC T CT C T T TAG G AGT CAC C CAC AGT T CT T GT GT GT A CAGAT 3274 

3401 AG GCCT GT TAT CAT AGAAGT CAT T TT AGACT CT CAAT T T T AAAT TAAT T - T T GAAT CACT 3459 

| | | IIMM | | I III I II I I I I M I I I I I I I M III 

3275 TGCTTTTTAATCATAAAGGACGCCCCAGATCTTCAATTTTAAGTTAGTTATTGGCTCCCC 3334 

34 60 AAT AT T T T CAC AGT T TAT T AAT AT AT TT AATT T CT ATT T AAAT T T T AGAT TAT T T T TAT T 3519 

| M I I M I I I I I I I I I I I I I I I I I I I II I II I I I M I I I I I I I I I 

3335 AGTAGTTTCACAGCGTGGATATATTTTTAATTTTTA-CTAAGTTTTAGATTGGTTTTATT 3393 



Qy 3520 ACCATGTACTGAATTTTTACATCCTGATACCCTTTCCTTCTCCATGT CAGTA 3571 

III II I I I I IN M I I I I I II I Ml I I 
Db 3394 GTTGTGTTCTAAATTCTTAAGTCCTAACATCTTTGTTTAACCCAGATGTTCCTTCCCTCT 3453 

Qy 3572 TCATGTTCTCTAATTATCTTGCCAAATTTTGAAACTACACACAAAAAGCATACTTGCATT 3631

Mill I I I I I M MINIMI II I II I M I M 

Db 3454 TCATGGGCAATAATCGTCCTGCCAAATTATGAAATGGCATAAGAATACTATTCACATAAT 3513 



ATTTATAATAAAATTGCATTCAGTGGCTTTTTAAAAAAAATGTTTGATTCAAAACTTTAA 
|| | | M II I II I N I I N I II I I II I I I I N Nl II 1 



3691 



Qy 3632 

I I I I I I I I I I 1 i ill mi iii i ' ■ ■ ■ 

Db 3514 ATATACAATAAAACTATATTAAGTGGCTTTTTTATTAAAAATTTTAGCACA CAG 3567 

Qy 3692 CATACTGATAAGTAAGAAACAATTATAATTTCTTTACATACTCAAAACCAAGATAGAAAA 3751

| | | | | I M II I I I II I N I I I I I I I I I I I M 

Db 3568 ACCAAGGGTGATAAGAAAAAAAACATGATTCCCTTGCATAATTAAAACCAAGATAAGAGA 3627 

Qy 3752 AGGTGCTATCGTTCAACTTCAAAACATGTTTCCTAGTATTAAGGACTTTAATATAGCAAC 3811

I I I | | Ml II II II I I II I NNI I INN 

Db 3628 AGGTACCATCT— AATTTAAAGCATATTTTCTAACATTTAAGTAGCCTAATATAGCAAT 3684 

Qy 3812 AGACAAAATTATTGTTAACATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAA.ee 3871

| II II II !IM! Ml II I III II II l I Ml Mil I ! i II II I II 
Db 3685 GCATAAAAATAGTGTTAACAAGGATGTTAGAGGTCAAACGATTTGTAAGTGACTTCAGCC 3744 

Qy 3872 TATTTTCTCCCTTATTATCCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATT 3931

Ml Ill INN HI I III I I I I I II I II III I I Ml Mill 

Db 3745 TATTTTCTCCCAAATTATTTACTGCTATTTTTGGTCTGTGTTCAAACA-TTTTCAGTATT 3803 

Qy 3932 GATAGCTTACATATGGCCAAAGGAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCT 3991

M | | | M II I I I I I M I I I II I II II N I 

Db 3804 GATAATGTGCA-ACAGCCAAAGGAACACTGTTTTCATCCAAATGCGGGTGTGTTGTACCT 3862 

Qy 3992 AACTTTATAAAAGTGTAATATAACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGG 4051 

|| I M I M I II II I I Nil I INI 

Db 3863 AAC— ATGCACTTGTAATAAAGCCGTGTAAAA— TAACTGTGTTTTGTTTTGCTCTGG 3916 

Qy 4052 TTGCCTAAAGTGGC------TATAGTTACTGATTTTTTATTATGTAAGCAAAACCAATAA 4105

| | II II I I II I II I I I I I II II II II II II N I II II 

Db 3917 TCACCTAAAGTGGCAGCTTGTGTCGTTGCTAACTTCTTGTTGAGTAAGCAAAACCAATAA 3976 

Qy 4106 AAATTTAAGTTTTT 4119 

I NNI II 
Db 3977 ACGTTCAAATGGTT 3990 
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AY415512 

LOCUS AY415512 1329 bp DNA linear GSS 17-DEC-2003 

DEFINITION Homo sapiens EDNRB gene, VIRTUAL TRANSCRIPT, partial sequence, 
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VERSION AY415512.1 GI : 39771471 
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Location/Qualifiers 

1..1329
/organism="Homo sapiens"
/mol_type="genomic DNA"
/db_xref="taxon:9606"
<1..>1329
/gene="EDNRB"
/locus_tag="HCM5582"



ORIGIN 



Query Match 30.9%; Score 1329; DB 29; Length 1329;
Best Local Similarity 100.0%; Pred. No. 2.2e-245;
Matches 1329; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy  238 ATGCAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 297
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGCAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 60

Qy  298 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 357
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 120

Qy  358 CAAACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCC 417
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 CAAACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCC 180

Qy  418 AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 477
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 240

Qy  478 CCGCCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTC 537
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 CCGCCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTC 300

Qy  538 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 597
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 360



Qy  598 ACACTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATC 657
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ACACTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATC 420

Qy  658 GCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTAC 717
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 GCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTAC 480

Qy  718 AAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATA 777
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 AAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATA 540

Qy  778 CAGAAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATAT 837
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 CAGAAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATAT 600

Qy  838 CGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTA 897
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 CGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTA 660

Qy  898 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTT 957
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTT 720

Qy  958 GATATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTT 1017
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 GATATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTT 780

Qy 1018 CAGAAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTC 1077
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 CAGAAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTC 840

Qy 1078 TATTTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATG 1137
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 TATTTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATG 900

Qy 1138 TTGAGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAA 1197
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 TTGAGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAA 960

Qy 1198 GTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCAC 1257
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 GTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCAC 1020

Qy 1258 CTCAGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTT 1317
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 CTCAGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTT 1080

Qy 1318 TTGAGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGC 1377
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TTGAGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGC 1140

Qy 1378 ATTAACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGC 1437
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 ATTAACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGC 1200



Qy 1438 TTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGC 1497
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 TTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGC 1260

Qy 1498 TTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGC 1557
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 TTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGC 1320

Qy 1558 TCATCTTGA 1566
        |||||||||
Db 1321 TCATCTTGA 1329



RESULT 4 
AK082103 
LOCUS AK082103 2521 bp mRNA linear HTC 20-SEP-2003
DEFINITION Mus musculus 0 day neonate cerebellum cDNA, RIKEN full-length
enriched library, clone:C230007M01 product:ENDOTHELIN B RECEPTOR
PRECURSOR, full insert sequence.
ACCESSION AK082103
VERSION AK082103.1 GI:26349538
KEYWORDS HTC; CAP trapper.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1
AUTHORS Carninci,P. and Hayashizaki,Y.
TITLE High-efficiency full-length cDNA cloning
JOURNAL Meth. Enzymol. 303, 19-44 (1999)
MEDLINE 99279253
PUBMED 10349636
REFERENCE 2
AUTHORS Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
TITLE Normalization and subtraction of cap-trapper-selected cDNAs to
prepare full-length cDNA libraries for rapid discovery of new genes
JOURNAL Genome Res. 10 (10), 1617-1630 (2000)
MEDLINE 20499374
PUBMED 11042159
REFERENCE 3
AUTHORS Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
TITLE RIKEN integrated sequence analysis (RISA) system--384-format
sequencing pipeline with 384 multicapillary sequencer
JOURNAL Genome Res. 10 (11), 1757-1771 (2000)
MEDLINE 20530913
PUBMED 11076861
REFERENCE 4
AUTHORS The RIKEN Genome Exploration Research Group Phase II Team and the
FANTOM Consortium.
TITLE Functional annotation of a full-length mouse cDNA collection

JOURNAL Nature 409, 685-690 (2001) 
REFERENCE 5 

AUTHORS The FANTOM Consortium and the RIKEN Genome Exploration Research 
Group Phase I & II Team. 

TITLE Analysis of the mouse transcriptome based on functional annotation 

of 60,770 full-length cDNAs 

JOURNAL Nature 420, 563-573 (2002) 
REFERENCE 6 (bases 1 to 2521) 

AUTHORS Adachi,J., Aizawa,K., Akimura,T., Arakawa,T., Bono,H., Carninci,P.,
Fukuda,S., Furuno,M., Hanagaki,T., Hara,A., Hashizume,W.,
Hayashida,K., Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T.,
Hori,F., Imotani,K., Ishii,Y., Itoh,M., Kagawa,I., Kasukawa,T.,
Katoh,H., Kawai,J., Kojima,Y., Kondo,S., Konno,H., Kouda,M.,
Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Murata,M.,
Nakamura,M., Nishi,K., Nomura,K., Numazaki,R., Ohno,M., Ohsato,N.,
Okazaki,Y., Saito,R., Saitoh,H., Sakai,C., Sakai,K., Sakazume,N.,
Sano,H., Sasaki,D., Shibata,K., Shinagawa,A., Shiraki,T.,
Sogabe,Y., Tagami,M., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
Takeda,Y., Tanaka,T., Tomaru,A., Toya,T., Yasunishi,A.,
Muramatsu,M. and Hayashizaki,Y.

TITLE Direct Submission

JOURNAL Submitted (16-APR-2002) Yoshihide Hayashizaki, The Institute of
Physical and Chemical Research (RIKEN), Laboratory for Genome
Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
Fax:81-45-503-9216)
COMMENT cDNA library was prepared and sequenced in Mouse Genome
Encyclopedia Project of Genome Exploration Research Group in Riken
Genomic Sciences Center and Genome Science Laboratory in RIKEN.
Division of Experimental Animal Research in Riken contributed to
prepare mouse tissues.

Please visit our web site for further details.
URL:http://genome.gsc.riken.go.jp/
URL:http://fantom.gsc.riken.go.jp/.
FEATURES Location/Qualifiers 
source 1..2521
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL/6J"
/db_xref="FANTOM_DB:C230007M01"
/db_xref="MGI:2415291"
/db_xref="taxon:10090"
/clone="C230007M01"
/tissue_type="cerebellum"
/clone_lib="RIKEN full-length enriched mouse cDNA library"
/dev_stage="0 day neonate"
CDS 213..1541
/note="unnamed protein product; ENDOTHELIN B RECEPTOR
PRECURSOR (SWISSPROT|P48302, evidence: FASTY, 100%ID,
100%length, match=1326)
putative"
/codon_start=1
/protein_id="BAC38409.1"
/db_xref="GI:26349539"
/translation="MQSPASRCGRALVALLLACGFLGVWGEKRGFPPAQATLSLLGTK
EVMTPPTKTSWTRGSNSSLMRSSAPAEVTKGGRGAGVPPRSFPPPCQRNIEISKTFKY
INTIVSCLVFVLGIIGNSTLLRIIYKNKCMRNGPNILIASLALGDLLHIIIDIPINTY
KLLAEDWPFGAEMCKLVPFIQKASVGITVLSLCALSIDRYRAVASWSRIKGIGVPKWT
AVEIVLIWVVSVVLAVPEAIGFDMITSDYKGKPLRVCMLNPFQKTAFMQFYKTAKDWW
LFSFYFCLPLAITAVFYTLMTCEMLRKKSGMQIALNDHLKQRREVAKTVFCLVLVFAL
CWLPLHLSRILKLTLYDQSNPHRCELLSFLLVLDYIGINMASLNSCINPIALYLVSKR
FKNCFKSCLCCWCQTFEEKQSLEEKQSCLKFKANDHGYDNFRSSNKYSSS"



ORIGIN 



Query Match 26.4%; Score 1137.6; DB 11; Length 2521; 

Best Local Similarity 76.4%; Pred. No. 1.2e-208; 

Matches 1507; Conservative 0; Mismatches 439; Indels 26; Gaps 8; 

Qy  193 AAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATGCAGCCGCCTCCA 252
        |||| || ||||||| ||| ||| |   | |||||| |  | | ||||||  ||||  ||
Db  168 AAACAGCAGAGCGGCTACCAGACTCTCACAGGAGCAAGCTGTAACATGCAATCGCCCGCA 227

Qy  253 AGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTGTCGCGGATCTGG 312
        || | ||||||||||||| |||| |||||| | || ||||| ||| | | | || | |||
Db  228 AGCCGGTGCGGACGCGCCTTGGTGGCGCTGCTGCTGGCCTGTGGCTTCTTGGGGGTATGG 287

Qy  313 GGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTC---CGCTTTTGCAAACCGCAGAG 369
        |||||| | ||||| ||||| |||| |   ||||| |   | ||| |    ||   ||||
Db  288 GGAGAGAAAAGAGGATTCCCACCTGCCCAAGCCACGCTGTCACTTCTCGGGACTAAAGAG 347

Qy  370 ATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGTCTGGCGCGG 429
         ||||||||||||||||||||||||  ||| |||  ||||||||| ||||||||  |||
Db  348 GTAATGACGCCACCCACTAAGACCTCCTGGACCAGAGGTTCCAACTCCAGTCTGATGCGT 407

Qy  430 TCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCCACGCACC 489
        || |  ||||||||||||||| | |||||||  |||  ||| |||   |||||| |  |
Db  408 TCCTCCGCACCTGCGGAGGTGACCAAAGGAGGGAGGGGGGCTGGAGTCCCGCCAAGATC- 466

Qy  490 ATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAATACATCAAC 549
          || |||||| ||||||||| ||   || |||||||   ||||||| ||||||||||||
Db  467 --CTTCCCTCCTCCGTGCCAACGAAATATTGAGATCAGCAAGACTTTTAAATACATCAAC 524

Qy  550 ACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCTGAGA 609
        ||| ||||||| ||||| ||||||||||| || ||||||||||||||||| || || |||
Db  525 ACGATTGTGTCGTGCCTCGTGTTCGTGCTAGGCATCATCGGGAACTCCACGCTGCTAAGA 584

Qy  610 ATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCCAGCTTGGCT 669
        || ||||||||||||||||||||||| || |||||||||||||||||||||||  |||||
Db  585 ATCATCTACAAGAACAAGTGCATGCGCAATGGTCCCAATATCTTGATCGCCAGTCTGGCT 644

Qy  670 CTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCTGGCA 729
        ||||||||||| ||||||||| |||| ||||| || || ||   ||||||| |||| |||
Db  645 CTGGGAGACCTACTGCACATCATCATAGACATACCCATTAACACCTACAAGTTGCTCGCA 704

Qy  730 GAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAGAAAGCCTCC 789
        ||||||||||||||||||||||||||||||||||||||||| ||||||||||| || ||
Db  705 GAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCCTTCATACAGAAGGCTTCT 764

Qy  790 GTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGAGCTGTTGCT 849
        ||||||||||| ||||||||||| |||||||| |||||||||||||||||||||||||||
Db  765 GTGGGAATCACAGTGCTGAGTCTTTGTGCTCTAAGTATTGACAGATATCGAGCTGTTGCT 824


Qy  850 TCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTTG 909
        ||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||
Db  825 TCTTGGAGTCGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTTA 884

Qy  910 ATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAATTACG 969
        |||||||||||||||||||||||||||||||| |||||||||||||||||||| ||||||
Db  885 ATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCCGAAGCCATAGGTTTTGATATGATTACG 944

Qy  970 ATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAGAAGACAGCT 1029
          ||||||||||||||     ||  |  ||||| ||||| ||||| ||||||| |||||
Db  945 TCGGACTACAAAGGAAAGCCCCTAAGGGTCTGCATGCTTAATCCCTTTCAGAAAACAGCC 1004

Qy 1030 TTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTTG 1089
        ||||||||||||||||||||||| |||||||||||||||||||||||||| |||||||||
Db 1005 TTCATGCAGTTTTACAAGACAGCCAAAGATTGGTGGCTGTTCAGTTTCTACTTCTGCTTG 1064

Qy 1090 CCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTGAGAAAGAAA 1149
        ||  | |||||||||||| | |||||||| || |||||||| |||||| | || |||||
Db 1065 CCGCTAGCCATCACTGCAGTCTTTTATACCCTGATGACCTGCGAAATGCTCAGGAAGAAG 1124

Qy 1150 AGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTGGCCAAAACC 1209
        || || |||||||||||||| ||||||||| ||||||||||||| ||||||||||| ||
Db 1125 AGCGGTATGCAGATTGCTTTGAATGATCACTTAAAGCAGAGACGAGAAGTGGCCAAGACA 1184

Qy 1210 GTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGATT 1269
        ||||| ||||||||||| || ||||| ||||| ||||||||||||||||||||| ||||
Db 1185 GTCTTCTGCCTGGTCCTCGTGTTTGCTCTCTGTTGGCTTCCCCTTCACCTCAGCCGGATC 1244

Qy 1270 CTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTGAGCTTTCTG 1329
        ||||||||||| || ||| | ||||   ||||  | || ||||| ||| |||||||| ||
Db 1245 CTGAAGCTCACCCTGTATGACCAGAGCAATCCACACAGGTGTGAGCTTCTGAGCTTTTTG 1304

Qy 1330 TTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATTAACCCAATT 1389
        ||||| |||||||| ||||||||||||||||||||  |||| |||||||| || |||||
Db 1305 TTGGTTTTGGACTACATTGGTATCAACATGGCTTCTTTGAACTCCTGCATCAATCCAATC 1364

Qy 1390 GCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCTGCTGG 1449
        ||||||||||||||||||||||||||||||||||||||||||||||| || |||||||||
Db 1365 GCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGTTTGTGCTGCTGG 1424

Qy 1450 TGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTAAAGTTCAAA 1509
        |||||  | ||||| ||||| |||||||||||||| |||||||| ||| | |||||||||
Db 1425 TGCCAAACGTTTGAGGAAAAGCAGTCCTTGGAGGAGAAGCAGTCCTGCCTGAAGTTCAAA 1484

Qy 1510 GCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCATCTTGAAAG 1569
        || || ||||||||||||||||||||||| ||||| |||||||||||||| ||||||| |
Db 1485 GCCAACGATCACGGATATGACAACTTCCGGTCCAGCAATAAATACAGCTCGTCTTGAAGG 1544

Qy 1570 AAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAAAACAAAATG 1629
         |  |  | || | | || ||| | || | ||  ||||| ||   |||||||||||||||
Db 1545 CAAGAACACTCGCCGAATCTCACTGTCCTCATTGTGGACAGATACCATTAAAACAAAATG 1604

Qy 1630 AAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTATTAAAATATT 1689
        ||||  |||||||| |||||   ||||     |  | ||| |      |       | |
Db 1605 AAACCGTTGCCAAATCAAAATGGAAAAAACCATGCTAGCAGAAAGGTGTGCGCGCGTGTG 1664

Qy 


1690 


AAGT GT AAT TAT T TT AAC ACT CACAGCT ACAT AT GAC AT T T TAT GAGC T GT T T AC 


1744 



I III II 1 11 I I 1 

Db 1665 AGAGGGATTATTTTTAACTGTTCTGACGCTCAACACCGGATATATTCACGGGCTGTTTAC 1724

Qy 1745 GGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAATTTTT 1804

|| || | I I I I I I I I I I I I I M 

Db 1725 AACCTAAGAAAGCTGTGGGAAGGAATGAAGCCCTCCTCCGTGGGGAAGCACTTAGATTCT 1784 

Qy 1805 TACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACTTAGG 1864 

| M Mill II I IN I I I I I I I 

Db 1785 T--AGTCAGCACTTCAGCAGAGCTCTTAAAAGCCCCTAGTGCGTTCACATGCCACTTACG 1842

Qy 1865 CTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTAAATC 1924

I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1843 TTTAAAAAAACG AGAACT T C ACT GAAGT T C T GTT C AGGAGT T T ATT AT C CAGT 1895 

Qy 1925 AATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATGAAGC 1984 

I Ill I Ml I N 

Db 18 96 CCT AT GAAT CT GGATT CAAGAAAGCAT GA CAT T G C AAAAC AAT T CT T AAAAC GAAGT 1952 

Qy 1985 TTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATATTATC 2044

II MM II I I II MM I I I IN II Ml MM 

Db 1953 TTCAATTGCTTAATTTGAAACTTT^AAAAAAAAAAACTAATAAATTTTTATGCATACTATC 2012 

Qy 2045 --ACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCA-TT 2101

          || |  | ||| |||||||| || ||||||| || |  ||| | | |||| |  | ||

Db 2013 ATACCCACTAATCTGATTGTAACTATATGCAAAAGAAAAGGCAATATGGTTGGTAAACTT 2072

Qy 2102 TTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCA 2153 

| | | || M I I II II I I I I II I I I I I I I I I I I I 11 

Db 2073 TTTTGGTCATTACCAACATTGAAATGATCAGAATTCGGGGGAAGAAAAGACA 2124 



RESULT 5 
AK085165 
LOCUS 

DEFINITION 



ACCESSION 
VERSION 
KEYWORDS 
SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 
TITLE 
JOURNAL 
MEDLINE 
PUBMED 

REFERENCE 
AUTHORS 

TITLE 

JOURNAL 



AK085165 3611 bp mRNA linear HTC 20-SEP-2003 

Mus musculus 13 days embryo lung cDNA, RIKEN full-length enriched 
library, clone:D430047G06 product:ENDOTHELIN B RECEPTOR PRECURSOR,
full insert sequence. 
AK085165 

AK085165.1 GI:26351484
HTC; CAP trapper. 
Mus musculus (house mouse) 
Mus musculus 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 
1 

Carninci,P. and Hayashizaki,Y.

High-efficiency full-length cDNA cloning 

Meth. Enzymol. 303, 19-44 (1999) 

99279253 

10349636 

2 

Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K., 
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
Normalization and subtraction of cap-trapper-selected cDNAs to 
prepare full-length cDNA libraries for rapid discovery of new genes 
Genome Res. 10 (10), 1617-1630 (2000) 



MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 



TITLE 

JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 

TITLE 
JOURNAL 
REFERENCE 
AUTHORS 

TITLE 

JOURNAL 
REFERENCE 
AUTHORS 



TITLE 
JOURNAL 



COMMENT 



20499374 
11042159 
3 

FEATURES

source

Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
RIKEN integrated sequence analysis (RISA) system--384-format
sequencing pipeline with 384 multicapillary sequencer 
Genome Res. 10 (11), 1757-1771 (2000) 
20530913 
11076861 
4 

The RIKEN Genome Exploration Research Group Phase II Team and the 
FANTOM Consortium. 

Functional annotation of a full-length mouse cDNA collection 

Nature 409, 685-690 (2001) 

5 

The FANTOM Consortium and the RIKEN Genome Exploration Research 
Group Phase I & II Team. 

Analysis of the mouse transcriptome based on functional annotation 
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1. .3611 



/organism="Mus musculus" 
/mol_type="mRNA" 
/strain="C57BL/6J"
/db_xref="FANTOM_DB:D430047G06"
/db_xref="MGI:2422091"
/db_xref="taxon:10090"
/clone="D430047G06"
/tissue_type="lung"

/clone_lib="RIKEN full-length enriched mouse cDNA library"
/dev_stage="13 days embryo"
CDS 217..1155

/note="unnamed protein product; ENDOTHELIN B RECEPTOR

PRECURSOR (SWISSPROT|P48302, evidence: FASTY, 100%ID,

100%length, match=1326)

putative"

/codon_start=1

/protein_id="BAC39379.1"

/db_xref="GI:26351485"

/translation="MQSPASRCGRALVALLLACGFLGVWGEKRGFPPAQATLSLLGTK
EVMTPPTKTSWTRGSNSSLMRSSAPAEVTKGGRGAGVPPRSFPPPCQRNIEISKTFKY
INTIVSCLVFVLGIIGNSTLLRIIYKNKCMRNGPNILIASLALGDLLHIIIDIPINTY
KLLAEDWPFGAEMCKLVPFIQKASVGITVLSLCALSIDRYRAVASWSRIKGIGVPKWT
AVEIVLIWVVSVVLAVPEAIGFDMITSDYKGKPLRVCMLNPFQKTAFMQFYKTAKDWW
LFSFYFWLAASHHCSLLYPDDLRNAQEEERYADCFE"

ORIGIN 

Query Match 26.2%; Score 1126.4; DB 11; Length 3611; 

Best Local Similarity 64.4%; Pred. No. 1.6e-206; 

Matches 2333; Conservative 0; Mismatches 952; Indels 336; Gaps 30; 

Qy 193 AAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATGCAGCCGCCTCCA 252

       |||| || ||||||| ||| ||| |   | |||||| |  | | ||||||  ||||  ||

Db 172 AAACAGCAGAGCGGCTACCAGACTCTCACAGGAGCAAGCTGTAACATGCAATCGCCCGCA 231

Qy 253 AGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTGTCGCGGATCTGG 312

       || | ||||||||||||| |||| |||||| | || ||||| ||| | | | || | |||

Db 232 AGCCGGTGCGGACGCGCCTTGGTGGCGCTGCTGCTGGCCTGTGGCTTCTTGGGGGTATGG 291

Qy 313 GGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTC---CGCTTTTGCAAACCGCAGAG 369

       |||||| | ||||| ||||| |||| |   ||||| |   | ||| |    ||   ||||

Db 292 GGAGAGAAAAGAGGATTCCCACCTGCCCAAGCCACGCTGTCACTTCTCGGGACTAAAGAG 351

Qy 370 ATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGTCTGGCGCGG 429

I I I I I I I I II I I I I I I I I I I I I I I III Ml I I I I I I I I I II I M I I I Ml 
Db 352 GT AAT GAC G C CAC C CAC T AAGAC CT C CT GGAC CAGAG GT T C CAACT C C AGT CT GAT GC GT 411 

Qy 430 TCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCCACGCACC 489 

Ml I I I I II I I I I I M II I I I I M I I Ml I I I I I I II I I I I I I 
Db 412 TCCTCCGCACCTGCGGAGGTGACCAAAGGAGGGAGGGGGGCTGGAGTCCCGCCAAGATC- 470

Qy 490 ATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAATACATCAAC 549

         || |||||| ||||||||| ||   || |||||||   ||||||| ||||||||||||

Db 471 --CTTCCCTCCTCCGTGCCAACGAAATATTGAGATCAGCAAGACTTTTAAATACATCAAC 528

Qy 550 ACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCTGAGA 609 

III I I I I I II I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I M I I II II Ml 
Db 529 ACGATTGTGTCGTGCCTCGTGTTCGTGCTAGGTATCATCGGGAACTCCACGCTGCTAAGA 588



610 AT TAT CT ACAAGAACAAGT GCAT G C GAAAC G GT C C CAAT AT C T T GAT C GCC AGCT T G GCT 669 

M I I I I M I I I I I I I II II I I I I I I I I I I I I M I I I I I I I I I I I I 

589 AT CAT CT ACAAGAACAAGT GCAT GCGCAAT GGT CCCAATAT CTT GAT C GCCAGT CT GGCT 648 

67 0 CTGGGAGACCT GCT GCACATCGTCATTGACATCCCTAT CAAT GTCTACAAGCT GCT GGCA 72 9 

| | M I I I I I I I I I I I I I I I I I I I I M M I II M II I I I I I I I I I I I IN 
649 CT GGGAGACCTACT GCACAT CAT CATAGACAT ACCCATTAACAC CT ACAAGTT GCT CGCA 708 

730 GAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAGAAAGCCTCC 7 89 

| | | | M I I I I I I II I I I I I I I I I I I I I I M M I I I I I I I I I I I I I I I I I I I I M M 
7 09 GAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCCTTCATACAGAAGGCTTCT 768 

7 90 GT GGGAAT CACT GT GCT GAGT CTAT GT GCT CT GAGTATT GACAGAT AT CGAGCT GTTGCT 849 

| I I I I I I I I I I I I I I I I M I I I M I I I I II I I I 1 II I I M I I I I I I I I I I I I I I I M 
769 GT GGGAAT CACAGT GCT GAGT CT TTGT GCT CTAAGTATTGACAGATAT CGAGCT GTTGCT 828 

850 TCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTTG 909 

I I ! I I I I II I II I M I I I M I I I I M I I I I I I I M MM I 

82 9 T CT T G GAGT C GAAT T AAAGGAAT T G GG GT T C C AAAAT GGACAG C AGT AGAAAT T GT TT T A 8 88 

910 ATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAATTACG 969 

| M | || I I I I I I II I II I I I I I M I II I I I I I I M I I M I I M I M II II I I I I I I I I 
889 ATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCCGAAGCCATAGGTTTTGATATGATTACG 948 

970 ATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAGAAGACAGCT 1029 

I I I I I I M M I II I M I M I I I II II I II I II II II I M M I II 

94 9 TCGGACTACAAAGGAAAGCCCCTAAGGGTCTGCATGCTTAATCCCTTTCAGAAAACAGCC 1008 

1030 TT CAT GCAGTTTTACAAGACAGCAAAAGATT GGT GGCT GTTCAGTTTCTATTTCT-GCTT 1088 

| | | I I I I I I I I I II I II I I I M I II I I I I I I M II I I II I I I I I I I M I I I M I I I I 
1009 TT CAT GCAGTTTTACAAGACAGCCAAAGATT GGT GGCT GTTCAGTTTCTACTTCTGGCTT 1068 

108 9 G C CAT T GGC C AT CACT GCAT T T T T T TAT AC AC T AAT GAC CT GT GAAAT GT T GAGAAAGAA 114 8 

Ml I I II I II I I I I I I I M II I I II M I I I I I M I I M I II II 

1069 GCC G CT AG C CAT CACT GCAGT CT T T TAT AC C CT GAT GAC CT GC GAAAT GCT C AGGAAGAA 1128 

1149 AAGT G GC AT G C AGAT T G CTT TAAAT GAT C AC CTAAAG C AGAGAC GGGAAGT GGC CAAAAC 1208 

|| || | I 1 I t I I 1 I I I I I 1 II I I I I II I I I I I I II I II M I II II I II I II I II 
1129 GAGC GGT AT GC AGAT T G CTT T GAAT GAT CACTTAAAGC AGAGAC GAGAAGT GGC CAAGAC 1188 

12 09 CGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGAT 12 68 

| | I I I | | | I I M II I I II I II I I INN M I I I I II II II II II II M I I I II 

1189 AGTCTTCTGCCTGGTCCTCGTGTTTGCTCTCTGTTGGCTTCCCCTTCACCTCAGCCGGAT 1248 

1269 T CT GAAGCT CA- CT CT T TAT AAT CAGAAT GAT C C CAAT AGAT GT GAACT T T T GAGCT T T C 1327 

I | || I || I I I I M II I I I II I I I I M II I II I M 

1249 CCTGAAGCTCAGCCCTGTATGACCAGAGCAATCCACACAGGTGTGAGCTTCTGAGCTTTT 1308 

1328 T GT T GGT AT T GGACT AT AT T G GT AT CAAC AT G GC T T CACT GAAT T CCT GCAT TAACC C AA 1387 

M I I II I I II II II I I I I I I I II I I I I I II I I M I I M I M M II 

1309 TGTTGGTTTTGGACTACATTGGTATCAACATGGCTTCTTTGAACTCCTGCATCAATCCAA 1368 

1388 TTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCTGCT 1447 

I | I I I I I I I I II II II I I II II II II M I M M I I I I I M II II M I I I M 

1369 TCGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGTTTGTGCTGCT 1428 



Qy 1448 GGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTAAAGTTCA 1507

        |||||||  | ||||| ||||| |||||||||||||| |||||||| ||| | |||||||

Db 1429 GGTGCCAAACGTTTGAGGAAAAGCAGTCCTTGGAGGAGAAGCAGTCCTGCCTGAAGTTCA 1488

Qy 1508 AAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCATCTTGAA 1567

        |||| || ||||||||||||||||||||||| ||||| |||||||||||||| |||||||

Db 1489 AAGCCAACGATCACGGATATGACAACTTCCGGTCCAGCAATAAATACAGCTCGTCTTGAA 1548

Qy 1568 AGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAAAACAAAA 1627

         | |  |  | || | | || ||| | || | ||  ||||| ||   |||||||||||||

Db 1549 GGCAAGAACACTCGCCGAATCTCACTGTCCTCATTGTGGACAGATACCATTAAAACAAAA 1608

Qy 1628 TGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTATTAAAATA 1687

        ||||||  |||||||| |||||   ||||     |  | ||| |      |       |

Db 1609 TGAAACCGTTGCCAAATCAAAATGGAAAAAACCATGCTAGCAGAAAGGTGTGCGCGCGTG 1668


Qy 


1688 


TTAAGT GT AAT TAT T T TAAC AC T CAC AG CT AC AT AT GAC ATTTTATGAGCTGTTT 

II I 1 1 II II 1 II 1 1 1 1 II 1 1 1 II II 1 1 
T GAGAGGGAT T ATT T T T AACT GT T CT GAC GCT CAACAC C G GAT AT AT T CAC GG GCT GT T T 


1742 


Db 


1669 


1728 


Qy 


1743 


AC GG C AT GGAAAGAAAAT CAGT GGGAAT T AAGAAAGCCT C GT C GT GAAAGC ACTT AAT T T 

|| M II 1 1 1 1 1 1 1 1 II 1 1 1 1 Mil II M 

ACAAC C T AAGAAAGCT GT GGGAAG GAAT GAAGC C CT C CT C C GT GGG GAAGC ACT T AGAT T 


1802 


Db 


1729 


1788 


Qy 


1803 


TT TACAGTTAGCACTT CAAC AT AGCT CTTAACAACTT C CAGGATATT C ACACAACACTTA 

M II 1 1 II 1 1 II M 1 1 1 1 1 1 1 1 M 1 1 1 III 1 1 1 1 1 1 1 1 1 1 1 1 

CTT— AGTCAGCACTTCAGCAGAGCTCTTAAAAGCCCCTAGTGCGTTCACATGCCACTTA 


1862 


Db 


1789 


1846 


Qy 


1863 


GGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTAAA 

| I I I 1 1 1 1 1 1 1 II 1 II 1 1 1 1 II II 1 1 
CGTTTAAAAA AAC GAGAACT T CACT GAAGTT CT GT T C AGGAGT T TAT TAT CCA 


1922 


Db 


1847 


1899 


Qy 


1923 


T CAAT GGGACT CT GAT ATAAAGGAAGAAT AAGT CACT GT AAAAC AGAACT T T T AAAT GAA 

1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 II M M 1 1 II Ml II 1 

GT C CT AT GAAT CT G GAT T CAAGAAAGC AT - - GAC AT T G CAAAACAAT T CT T AAAAC GAAG 


1982 


Db 


1900 


1957 


Qy 


1983 


GCTTAAATTACTCAATTTAA7^ATTTTAAAATCCTTTAAAACAACTTTTCAATTAATATTA 

| I I I I I 1 1 II 1 II MM 1 1 1 M 1 II 1 1 1 1 1 
TTTCAATTGCTTAATTTGAAACTTAAAAAAAAAAAAACTAATT^ATTTTTATGCATACTA 


2042 


Db 


1958 


2017 


Qy 


2043 


TC— ACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCA- 

M Ml 1 1 1 1 1 II 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
TCATACCCACTAATCTGATTGTAACTATATGCAAAAGAAAAGGCAATATGGTTGGTAAAC 


2099 


Db 


2018 


2077 


Qy 


2100 


TTTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTG 

|| || | M 1 1 1 1 1 1 1 M 1 II 1 II 1 1 1 1 1 1 1 1 M 1 II 
TTTTTTGGT CAT T AC CAAC AT T G AAAT GAT C AGAAT T C G G G G GAAGAAAAGAC AG 


2159 


Db 


2078 


2132 


Qy 


2160 


T T T T T GAAAAT CAT T ACACT TT C ACT AGAAGC C CAAAC CT C AGC AT T CT GCAAT AT GT AA 


2219 


Db 


2133 




2132 




Qy 


2220 


C CAAC AT GT CACAAACAAGCAG CAT GTAACAGACTGGC ACAT GT GCCAGCT GAAT TT AAA 
1 1 1 1 1 1 IN M 


2279 


Db 


2133 


2168 


Qy 


2280 AT AT AAT ACT T T T AAAAAGAAAAT TAT T ACAT CCT T T ACAT T C AGT T AAGAT CAAAC CT C 


2339 



2169 



234 0 ACAAAGAGAAAT AGAAT GT T T GAAAGGC T AT C C C AAAAGAC T T T T TT GAAT CT GT CAT T C 2399 

| | | | | | | I I III I I I I I I I I I I I I I > 

2215 AGAAAGACACAAA ACAGAACACT AC CT AT GAT T T CT TT AAAGT T CT T T CAA 2265 

2400 AC AT AC C CT GT GAAGACAAT AC TAT CT ACAAT T T T T T C AGGAT T ATT AAAAT CTTCTTTT 2459 

| | I I I I M M I I I I I I I I I I I I I 
2266 AT AT C CTT T CAT GAT T GAAGT T T AAAT T CC AT GT GTT C AACT T CAT C A 2313 

2460 TTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATAC 2519 

I I II I II I I I I M I I 

2314 TCTGTAAATACTTAGCTATTAGCTATAAGC 2343 

2520 ACT GCAT GT AGAT GAT T AAAT GA — GGG CAGG C C CT GT GCT C AT AGCT T T AC GAT GGAGA 2577 

M I I I I I I I I I I I 1 I I II I I I I I I I I M I I I I 

2344 ACTACACGTAGAGGACTTAACAAAGGGCAGGTCCCAGCGTTCGTAGCTTTCTGACAAAGA 24 03 

2578 GAT GCC AGT GAC CT C ATAAT - - AAAGACT GT GAACT GC CT G GT GCAGT GT C C AC AT GAC A 2635 

| M I I I II I I I I Ml I I I I I I I I I I M I I I I I I I I I M I I I I II I I I I 
2404 GAT GCCAGTAAC CCGGTTATAGACAGAAT GT GAATT GCC CGGT GCAGT GT CCACAT GGCA 2463 

2636 AAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGT 2695 

Ml I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I Ml 
2464 AAGAAGC AGGGAGC AT C — CT T T CAGC CAT GCT GT AGAGAAAAT G GT C C ACAGC 2515 

2 696 ATAAT G CT AT AGT TAAAAT ACT AT T T T T CAAAAT CAT AC AGAT T AGT - AC AT T TAACAG C 2754 

| M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
2516 ACAAT AT GAT AGCGAAAATACC GT GGTTTAAC GC CATAGAAAATAGT CACT GTAACCAGC 2575 

2755 TACCTGTAAAGCTT ATT ACTAA-TTTTTGTATTATTTTTGT AAAT AGCCAATAGAAAAGT 2813 

| || I I I I I I II I I I I I I I I II M I II I II I I I I I I I M I I I I I 
2576 TCTCTCGGAGGCATACTACCAACTTTTTATGTTATTCCTGAAAATAGCCAATAGAAAGGC 2635 

2 814 TTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGA 287 3 

I M II I I I I I I I I I I I I I M I I N I I I I I I I I 

2636 GTTCTGGACATGGTGCTTTTTCTAAAACGTAGAAGCCAAACTGCTTCGGGGTCTGCAAGA 2695 

2 874 ACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGG 2 933 

| | | | | || II M I I I I I II I I I I I I I 

2696 TCCTCCTCTTTGCGCATTCTTGTCTAGGTTTTTTTTTTTTTTTTTTTAATCTCCTTTCCA 2755 

2934 AT AGCT T GGGAT GAGAT GT GT GT GAAAGT AT GT A 2967 

I I II III I I I I I 

2756 CNAACGTGCCCTTAGGGTTCAACTCCGGAATTGAAGCCGGTGGTGTNNGAAANGT^AATGC 2815 

2968 — CAAGAGAAAACGGAAGAGAGAGGAAATGAGGTGGGGTTGGAGG AAA 3013 

I I I I I I II I I I I I I M I I I I I I I I I I I I I I 

2 816 CCCAAGAGAAAAACTTAAAGAGAGAAGGAAATTTGAGGTTGGGGGCCCAGAAAGAAAAGC 2875 

3014 CCCATGGGG ACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCTCGTCACA 3067 

Ml | | U I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I 

287 6 CCCTTGGGGGAAAATAAATATTCCCATTCTTAGCCCTGTGTTCGTCACTGCCACGTCATG 2935 

3068 T CAAT GCAAAAGGT C CT GATTTT GTT C CAGC AAAACACAGT GCAAT GTT CT CAGAGT GAC 3127 
|| || I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I M I I I 



Db 



2 936 TCGGTGTGAAAGGTCCTGGTTCGGCTCCAGCAAAACAAAGCGCAGCGTTCTCAGCGTGAC 2 995 



Q y 3128 T T T C GAAAT AAAT T GGG C C C AAGAGC T T T AACT CG GT CT T AAAAT AT G C C CAAAT T 3183 

| | | I I I I I I I I I I I I II I I I I I I I I I I I I IN 

Db 2996 - T C G GGAACAAAC CAAG C C C GAGAG C T T T AACC TT GT CT TAAAAT AT AAC AGAT T T T C CT 3054 



Qy 



3184 TTT 3186 

I I 

Db 3055 TCCTTCCTTTTTCTCTTTCTTCTCTTCTCTTCTCTTCTCTTCTCTTCTCTTCTCTTCTCT 3114 



Qy 3187 ACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATGTTGTTT 3246 

Ml I I I I I I I I I I I I I I I I 1 I I I I I I I I M I I I I I I M 

Db 3115 TCTCTTCTCTTCTTTTCATAACCCAGGCCACATGTTGAAAATGAGCTTAACAATGCAGTT 3174 

Qy 3247 T C T GT CAAT AT T GAAT GT GAT GGT AC AGT AAAC C AAAAC C CAACAAT GT GGC C AGAAAGA 3306 

| I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I 

Db 3175 TTCTACCAAAATCATTGTGACAATACAATAAACCCAAACGGGACAATGAGGTAAAAAACC 3234

Qy 3307 AAG AGCAAT AAT AAT TAAT T CAC ACAC CAT AT GGAT T C TAT T TAT AAAT C AC C C ACAAAC 3366 

| | | | | M I I I I I I I I I M I I I I I II M I 

Db 3235 AAGAAC AAT AC T GAAT C CAC GT GAC AC ATGACTCTCTTTAGGAGTCACCCACAGTT 3290 

Qy 3367 T T GT T CT T TAAT T T CAT C C CAAT CAC T T TT T C AGAGGC CT GT TAT CAT AGAAGT C ATT T T 3426 

| | | 1 | II I I I I I I II I I I 

Db 3291 CTTGTGTGTA C AGAT T GCT T T T TAAT CAT AAAGGAC GCC C C 3331 

Qy 3427 AGACT CT CAAT T T T AAAT T AATT - T T GAAT CACT AAT AT TTT CAC AGT T TAT TAAT AT AT 3485 

Ml I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 3332 AGAT CT T CAAT T T T AAGT T AGTT ATT GGCT C C C C AGT AGT T T CACAGC GT GGAT AT AT T T 3391 

Qy 348 6 T TAAT T T CT AT T T AAATT TT AGAT TAT TTT TAT T AC CAT GT ACT GAAT TTT T AC AT C CT G 3545 

I I I I I I I M I I I I II I I M I I I I I I I I I I II I I I I I I I I I I I I I 

Db 3392 TTAATTTTTACT-AAGTTTTAGATTGGTTTTATTGTTGTGTTCTAAATTCTTAAGTCCTA 3450 

Qy 3546 ATACCCTTTCCT TCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAA 3597 

| | | | | | I I I I I Mill I MM II I I I I II I 

Db 3451 ACATCTTTGTTTAACCCAGATGTTCCTTCCCTCTTCATGGGCAATAATCGTCCTGCCAAA 3510 

Qy 3598 TTTTGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGG 3657

        || |||||   || |  || |  || |    | ||| || ||||||| |  ||| |||||

Db 3511 TTATGAAATGGCATAAGAATACTATTCACATAATATATACAATAAAACTATATTAAGTGG 3570

Qy 3658 CTTTTTAAAAAAAATGTTTGA 3678

        ||||||  |  |||  ||| |

Db 3571 CTTTTTTTATTAAAAATTTTA 3591
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AK076426 2669 bp mRNA linear HTC 18-SEP-2003 

Mus musculus 0 day neonate head cDNA, RIKEN full-length enriched
library, clone:4832401B07 product:ENDOTHELIN B RECEPTOR PRECURSOR,
full insert sequence. 
AK076426 

AK076426.1 GI:26345371

HTC; CAP trapper. 

Mus musculus (house mouse) 
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Mus musculus

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

1 

Carninci,P. and Hayashizaki,Y.

High-efficiency full-length cDNA cloning 

Meth. Enzymol. 303, 19-44 (1999) 

99279253 

10349636 

2 

Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K., 
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
Normalization and subtraction of cap-trapper-selected cDNAs to 
prepare full-length cDNA libraries for rapid discovery of new genes 
Genome Res. 10 (10), 1617-1630 (2000) 
20499374 
11042159 
3 

TITLE
JOURNAL

Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
RIKEN integrated sequence analysis (RISA) system--384-format
sequencing pipeline with 384 multicapillary sequencer 
Genome Res. 10 (11), 1757-1771 (2000) 
20530913 
11076861 
4 

The RIKEN Genome Exploration Research Group Phase II Team and the 
FANTOM Consortium. 

Functional annotation of a full-length mouse cDNA collection 

Nature 409, 685-690 (2001) 

5 

The FANTOM Consortium and the RIKEN Genome Exploration Research 
Group Phase I & II Team. 

Analysis of the mouse transcriptome based on functional annotation 
of 60,770 full-length cDNAs 
Nature 420, 563-573 (2002) 
6 (bases 1 to 2669) 

Adachi,J., Aizawa,K., Akimura,T., Arakawa,T., Bono,H., Carninci,P.,
Fukuda,S., Furuno,M., Hanagaki,T., Hara,A., Hashizume,W.,
Hayashida,K., Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T.,
Hori,F., Imotani,K., Ishii,Y., Itoh,M., Kagawa,I., Kasukawa,T.,
Katoh,H., Kawai,J., Kojima,Y., Kondo,S., Konno,H., Kouda,M.,
Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Murata,M.,
Nakamura,M., Nishi,K., Nomura,K., Numazaki,R., Ohno,M., Ohsato,N.,
Okazaki,Y., Saito,R., Saitoh,H., Sakai,C., Sakai,K., Sakazume,N.,
Sano,H., Sasaki,D., Shibata,K., Shinagawa,A., Shiraki,T.,
Sogabe,Y., Tagami,M., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
Takeda,Y., Tanaka,T., Tomaru,A., Toya,T., Yasunishi,A.,
Muramatsu,M. and Hayashizaki,Y.
Direct Submission 

Submitted (16-APR-2002) Yoshihide Hayashizaki, The Institute of
Physical and Chemical Research (RIKEN), Laboratory for Genome 



COMMENT 



FEATURES 

source 



CDS 



Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
Fax:81-45-503-9216)

cDNA library was prepared and sequenced in Mouse Genome 
Encyclopedia Project of Genome Exploration Research Group in Riken 
Genomic Sciences Center and Genome Science Laboratory in RIKEN. 
Division of Experimental Animal Research in Riken contributed to 
prepare mouse tissues. 

Please visit our web site for further details. 
URL:http://genome.gsc.riken.go.jp/
URL:http://fantom.gsc.riken.go.jp/.

Location/Qualifiers

1..2669

/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL/6J"
/db_xref="FANTOM_DB:4832401B07"
/db_xref="MGI:2391324"
/db_xref="taxon:10090"
/clone="4832401B07" 
/tissue_type="head" 

/clone_lib="RIKEN full-length enriched mouse cDNA library" 
/dev_stage="0 day neonate" 
220..1548

/note="unnamed protein product; ENDOTHELIN B RECEPTOR 

PRECURSOR (SWISSPROT|P48302, evidence: FASTY, 100%ID,

100%length, match=1326)

putative"

/codon_start=1

/protein_id="BAC36337.1"

/db_xref="GI:26345372"

/translation="MQSPASRCGRALVALLLACGFLGVWGEKRGFPPAQATLSLLGTK
EVMTPPTKTSWTRGSNSSLMRSSAPAEVTKGGRGAGVPPRSFPPPCQRNIEISKTFKY
INTIVSCLVFVLGIIGNSTLLRIIYKNKCMRNGPNILIASLALGDLLHIIIDIPINTY
KLLAEDWPFGAEMCKLVPFIQKASVGITVLSLCALSIDRYRAVASWSRIKGIGVPKWT
AVEIVLIWVVSVVLAVPEAIGFDMITSDYKGKPLRVCMLNPFQKTAFMQFYKTAKDWW
LFSFYFCLPLAITAVFYTLMTCEMLRKKSGMQIALNDHLKQRREVAKTVFCLVLVFAL
CWLPLHLSRILKLTLYDQSNPHRCELLSFLLVLDYIGINMASLNSCINPIALYLVSKR
FKNCFKSCLCCWCQTFEEKQSLEEKQSCLKFKANDHGYDNFRSSNKYSSS"



ORIGIN 



Query Match 23.7%; Score 1020; DB 11; Length 2669;

Best Local Similarity 84.7%; Pred. No. 5.1e-186;

Matches 1169; Conservative 0; Mismatches 205; Indels 6; Gaps 2;



Qy 193 AAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATGCAGCCGCCTCCA 252

       |||| || ||||||| ||| ||| |   | |||||| |  | | ||||||  ||||  ||

Db 175 AAACAGCAGAGCGGCTACCAGACTCTCACAGGAGCAAGCTGTAACATGCAATCGCCCGCA 234



Qy 253 AGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTGTCGCGGATCTGG 312

       || | ||||||||||||| |||| |||||| | || ||||| ||| | | | || | |||

Db 235 AGCCGGTGCGGACGCGCCTTGGTGGCGCTGCTGCTGGCCTGTGGCTTCTTGGGGGTATGG 294

Qy 313 GGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTC---CGCTTTTGCAAACCGCAGAG 369

       |||||| | ||||| ||||| |||| |   ||||| |   | ||| |    ||   ||||

Db 295 GGAGAGAAAAGAGGATTCCCACCTGCCCAAGCCACGCTGTCACTTCTCGGGACTAAAGAG 354
370 ATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGTCTGGCGCGG 42 9 

Mill I Mill I MM MMIIM Ml 

355 GT AAT GAC GC C AC C CACT AAGAC C T C CT G GAC C AGAG GT T C CAACT C C AGT CT GAT G C GT 414 

4 30 TCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCCACGCACC 489 

| | I II I II I I I M I I I M I IMIM II II I I I I 

415 TCCTCCGCACCTGCGGAGGTGACCAAAGGAGGGAGGGGGGCTGGAGTCCCGCCAAGATC- 473 

490 AT CTCCCCTCCCCCGT GC CAAGGACC CAT C GAGAT CAAG GAGACT T T CAAATAC AT C AAC 549 

|| Ml Ml II II I M I I M MIMM I 1 1 1 I I I I I I I I 

474 — CTTCCCTCCTCCGT GC CAACGAAAT AT T GAGAT C AG CAAGACT T T T AAATAC AT CAAC 531 

550 ACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCTGAGA 609 

| | Ml Mill M I M I M II II I I M II Ml 

532 ACGATTGTGTCGTGCCTCGTGTTCGTGCTAGGCATCATCGGGAACTCCACGCTGCTAAGA 591 
610 AT TAT CT ACAAGAACAAGT G CAT GC GAAAC GGT C CCAAT AT CT T GAT C GC C AGCT T GGCT 669 

M I II II II M II II I II M II II I I M I II I M II I M I II Mill 

592 AT CAT CT ACAAGAACAAGT G CAT GC G CAAT G GT CCCAAT AT CT T GAT CG C CAGT CT GGCT 651 
670 CTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCTGGCA 729 

MMM MMIIM II I I I I I M I I M II MM Ml 

652 CT G GGAGAC CT ACT GC ACAT CAT CAT AG AC AT AC C CAT T AAC AC CT ACAAGT T GCT CGCA 711 
730 GAGGACTGGCCATTTGGAGCT GAGAT GTGTAAGCT GGT GCCTTT CAT ACAGAAAGCCTCC 789 

| | M II I II M II II M II II I II I II I I I I I I I I I I II I I I M M II II I I II II 

712 GAGGACT GG C C AT TTGGAGCT GAGAT GTGTAAGCT GGT GCCCTT CAT AC AGAAGGCTTCT 771 

790 GTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGAGCTGTTGCT 849 

| || | | | | | M I I II I I I I II I I I I I II I I I I I II I I I I I I II II II II I II M M II 
772 GTGGGAATCACAGTGCTGAGTCTTTGTGCTCTAAGTATTGACAGATATCGAGCTGTTGCT 831 

850 TCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTTG 909

| | | | | || I I II I II M I II I II II I I II M I I I II II M I II II II I I I II M I II II 

832 T CT T GGAGT C GAAT T AAAG GAATT GG GGT T C C AAAAT GGAC AGC AGT AGAAAT T GT T T T A 891 

910 ATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAATTACG 969 

|| | | | M II I II I M II I II I I I II I I I M I I I I II I I II II II I II I II I I I II I II 
892 ATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCCGAAGCCATAGGTTTTGATATGATTACG 951 

97 0 AT GGACT ACAAAGGAAGTT AT CT G C GAAT CT GCT T G CTT C AT C C C GT T CAGAAGAC AGCT 1029 

I | | M II II II II I M I I I I I I I I I M M I M I I II I I I I M II 

952 TCGGACTACAAAGGAAAGCCCCTAAGGGTCTGCATGCTTAATCCCTTTCAGAAAACAGCC 1011 

1030 TTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTTG 1089 

|| I I I II II II II I I I II M I II II I I I I I II I II I I II II II I I I I I I I II M I I M 
1012 TT CAT GCAGTTTTACAAGACAGCCAAAGATT GGT GGCT GTTCAGTTTCTACTTCTGCTTG 1071 

1090 C CAT T GG C CAT CACT GCAT T T T T T T AT ACAC T AAT GAC CT GT GAAAT GT T GAGAAAGAAA 1149 

M I I M MIMM M I I I I I I I I I II I I I I I 

1072 C C GCT AGC CAT CACT GCAGT C T T T TAT AC C CT GAT GAC CT G C GAAAT GCT C AGGAAGAAG 1131 

1150 AGT GGC AT G CAGAT T G CTT TAAAT GAT C AC CT AAAGC AGAGAC GG GAAGT GGC CAAAAC C 12 09 

|| M II II M I II II I I I I I I I II I I I M I I II I I M II I I II II II I II I M 
1132 AGCGGT AT GCAGATT GCTTTGAAT GAT C ACTTAAAGCAGAGACGAGAAGT GGCCAAGACA 1191 



Qy 1210 GTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGATT 12 69 

1 I I I I I I I I II I I I I I II I I II I Mill I I I I I I II I I I I I I I I I I I I I I I I I 

Db 1192 GTCTTCTGCCTGGTCCTCGTGTTTGCTCTCTGTTGGCTTCCCCTTCACCTCAGCCGGATC 1251 

Qy 127 0 CTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTGAGCTTTCTG 132 9 

I I I I I 1 I I M I M I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 

Db 1252 CTGAAGCTCACCCTGTATGACCAGAGCAATCCACACAGGTGTGAGCTTCTGAGCTTTTTG 1311 

Qy 1330 T T G GT AT T GG ACT AT AT T G GT AT CAAC AT G GC T T CAC T GAAT T C C T G CAT T AAC C CAAT T 138 9 

I I II I I I 1 I I I I I I I I I I II I I I M I I I I I I II 11 I I I I I I I I I I II I I I I I 

Db 1312 TTGGTTTTGGACTACATTGGTATCAACATGGCTTCTTTGAACTCCTGCATCAATCCAATC 1371 

Qy 1390 GCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCTGCTGG 1449 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I 

Db 1372 GCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGTTTGTGCTGCTGG 1431 

Qy 1450 TGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTAAAGTTCAAA 1509 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I III I I I I I I I I I I 

Db 1432 T G C CAAAC GT T T GAGGAAAAGC AGT C CT T G GAGGAGAAG C AGT C CT GC CT GAAGTT CAAA 14 91 

Qy 1510 GCTAAT GAT C AC GGAT AT G ACAACT T C C GT T C CAGTAATAAAT AC AG CT CAT CT T GAAAG 1569 

II II I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I 

Db 1492 GC CAAC GAT CAC GGAT AT GACAACT T C C GGT C CAGCAAT AAAT AC AG CT C GT C T T GAAGG 1551 
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AY415514 
LOCUS 

DEFINITION 

ACCESSION 
VERSION 
KEYWORDS 
SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 



TITLE 
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TITLE 
JOURNAL 
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AY415514 1329 bp DNA linear GSS 17-DEC-2003 

Mus musculus EDNRB gene, VIRTUAL TRANSCRIPT, partial sequence, 
genomic survey sequence. 
AY415514 

AY415514.1 GI:39771473
GSS. 

Mus musculus (house mouse) 
Mus musculus 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

1 (bases 1 to 1329) 

Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
Adams,M.D. and Cargill,M.

Inferring nonneutral evolution from human-chimp-mouse orthologous
gene trios 

Science 302 (5652), 1960-1963 (2003) 
14671302 

2 (bases 1 to 1329) 

Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
Adams,M.D. and Cargill,M.
Direct Submission 

Submitted (16-NOV-2003) Celera Genomics, 45 West Gude Drive,
Rockville, MD 20850, USA 

This sequence was made by sequencing genomic exons and ordering 
them based on alignment. 



FEATURES Location/Qualifiers 
source 1..1329

/organism="Mus musculus"
/mol_type="genomic DNA"
/db_xref="taxon:10090"
gene <1..>1329
/gene="EDNRB"
/locus_tag="HCM5582"



ORIGIN 

Query Match 23.2%; Score 996; DB 29; Length 1329; 

Best Local Similarity 85.3%; Pred. No. 2.5e-181; 

Matches 1136; Conservative 0; Mismatches 190; Indels 6; Gaps 2; 

ATGCAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 297

I I I I I MM MM I M M M M M M I MM M M M I M Mill III 

ATGCAATCGCCCGCAAGCCGGTGCGGACGCGCCTTGGTGGCGCTGCTGCTGGCCTGTGGC 60 

CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTC CGCTT 354 

| | | || I I II I II I I I I I I I M I I M I I I I I I I I I M I I I I I 
TTCTTGGGGGTATGGGGAGAGAAAAGAGGATTCCCACCTGCCCAAGCCACGCTGTCACTT 120 

T T GCAAAC C GC AGAGATAAT GAC G C C AC CC ACTAAGAC CT T AT GGC C CAAGGGT T C CAAC 414 

I II I I I I II II I I II I I I I I I I I II I I I II I I M li I I II I M I I I 

CT C GGGACT AAAGAGGT AAT GAC GCC AC C C AC T AAGAC CT C CT G GAC C AGAGGT T CCAAC 180 

GCCAGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGA 474

I M I I I I I II I II I M I I I I I I I I I I I I I I I I I I I M I I I 111 I t I 
TCCAGTCTGATGCGTTCCTCCGCACCTGCGGAGGTGACCAAAGGAGGGAGGGGGGCTGGA 240 

T CT C CGC CAC GC AC CAT CTCCCCTCCCCC GT G CCAAG GAC C CAT C GAGAT C AAG GAGACT 534 

II I M I I I I I I II I II I II I I I I I I II II I I II I M I I I M 

GTCCCGCCAAGATC CTTCCCTCCTCCGTGCCAACGAAATATTGAGATCAGCAAGACT 297 

TTCAAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAAC 594 

M I M II I II I I II II I I M I M I Mill I I I II II I I I I II II M II I II M I 
T T T AAAT AC AT CAACAC GAT T GT GT C GT GC CT C GT GT T C GT G CT AG GC AT CAT C GGGAAC 357 

T C C ACACT T CT GAGAAT TAT CT ACAAGAACAAGT GCAT G C GAAAC GGT C C CAAT AT CT T G 654 

Mill II II Mill I I I I I I I I I M I II II I I I I I I I II I I II I M I I I I I I M 

T C C AC GCT GC TAAGAAT CAT CT ACAAGAACAAGT GCAT GC GCAAT G GT C C CAAT AT CTT G 417 

ATCGCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTC 714 

II || II II I I II II I I II I M I II II I II I I I I I I II I II II II II M I 
AT CGC CAGT CTGGCTCT GGGAGAC CT ACT GC AC AT CAT CAT AG AC AT AC C CAT T AACAC C 477 

TACAAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTC 774

MUM I I I I I II I II I I M II I II I I I I I I I M I II I I I I I I I I M I I M I II I II 

TACAAGTT GCT CGCAGAGGACTGGCCATTTGGAGCT GAGAT GTGTAAGCTGGTGCCCTTC 537 
ATACAGAAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGA 834

Ill I I M I I I I I I II M I I I I II I I M I I I I I I 

ATACAGAAGGCTTCTGT GGGAAT CACAGT GCT GAGT CTTTGT GCT CTAAGT ATT GACAGA 597 

TATCGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCA 894 

II II II I I II I I I I I I I II MM I I II II II I I II I I I I M II II I 

TATCGAGCTGTTGCTTCTTGGAGTCGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCA 657 



Qy 


238 


Db 


1 


Qy 


298 


Db 


61 


Qy 


355 


Db 


121 


Qy 


415 


Db 


181 


Qy 


475 


Db 


241 


Qy 


535 


Db 


298 


Qy 


595 


Db 


358 


Qy 


655 


Db 


418 


Qy 


715 


Db 


478 


Qy 


775 


Db 


538 


Qy 


835 


Db 


598 



Qy 


895 


Db 


658 


Qy 


955 


Db 


718 


Qy


1015 


Db 


778 


Qy


1075 


Db 


838 


Qy


1135 


Db 


898 


Qy


1195 


Db 


958 


Qy


1255 


Db 


1018 


Qy


1315 


Db 


1078 


Qy


1375 


Db 


1138 


Qy


1435 


Db 


1198 


Qy 


1495 


Db 


1258 


Qy 


1555 


Db 


1318 



GTAGAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGT 954 

i | | | | | I I 1 M I I I I II I I II I II M I I I I I I I I I I I I I M I I I II I I I I 

GTAGAAATTGTTTTAATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCCGAAGCCATAGGT 717 

T T T GAT AT AAT T AC GAT GGAC T AC AAAGGAAGT T AT CT G C GAAT CTGCTTGCTT C AT CC C 1014 
| | | | I I I I I I I I I I I I I I I I II I I I I I I M I I I I I I M I I I I I I I I 

T T T GAT AT GAT T AC GT C GGACT ACAAAG GAAAG C C C CT AAG GGT CT G CAT GCT T AAT CC C 777 

GTT C AGAAGAC AGCT T T CAT G C AGT T T T AC AAGAC AGC AAAAGATT GGT GGCT GT T CAGT 1074 

| | M I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I II I I II I I I 
T TT C AGAAAAC AGC CT T CAT GCAGT T T T AC AAGAC AGC CAAAGAT T G GT G GCT GT T CAGT 837 

TTCTATTTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAA 1134 

| I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M III 

T T CTACT T CT G CTTGCCGCTAGC CAT CACT GCAGT CTTTTATACCCT GAT GACCTGCGAA 897 

AT GT T G AGAAAGAAAAGT GG CAT G C AGAT T GCT T T AAAT GAT C AC C T AAAGCAGAGAC GG 1194 

Ml | || I I I I I I I II I I I I I I I I I I I I I I 

AT GC T C AGGAAGAAGAGC GGT AT GC AGATT GCT T T GAAT GAT CACT T AAAG C AGAGAC GA 957 

GAAGTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTT 1254 

| | I I I I M I I I II Mill I I I I I I I II I I II I I I I I I I I I I I I II I I I I I I I I 
GAAGTGGCCAAGACAGTCTTCTGCCTGGTCCTCGTGTTTGCTCTCTGTTGGCTTCCCCTT 1017 

CACCT CAGCAGGATTCT GAAGCT CACT CTTT ATAATCAGAAT GAT CCCAAT AGAT GT GAA 1314 

I M I I I I I I I I I I I I II I II I I II I I I I I I I I II I I I I I I I I I I I I 

C AC CT C AGC C G GAT CCT GAAG CT CAC C C T GT AT GACC AGAG CAAT C C AC AC AG GT GT GAG 1077 

CTTTTGAGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCC 1374 

IN I I II I II I I II I I II I I I I I I II I I I I I II I I I I I M I M I I I Mil M I 
CTTCTGAGCTTTTTGTTGGTTTTGGACTACATTGGTATCAACATGGCTTCTTTGAACTCC 1137 

TGCATTAACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCA 1434

| | I I I M II I I I II II I M I II I II II I I I II I M M I II II II I II I II M M I II 

T GC AT CAAT C CAAT C G CT CT GT AT T T GGT GAGCAAAAGATT CAAAAACT G CT T T AAGT C A 1197 
T GCT TAT GCT GCT GGT GC CAGT CAT TTGAAGAAAAAC AGT CCTTGGAGGAAAAGCAGTCG 1494 

|| || M I II I II II II M I I II II Mill M II I I II II M II II II I I I I 

TGTTTGTGCTGCTGGTGCCAAACGTTTGAGGAAAAGCAGTCCTTGGAGGAGAAGCAGTCC 1257 
T GCT T AAAGT T CAAAGCT AAT GAT C AC GGAT AT GACAACT T C C GT T C CAGT AAT AAAT AC 1554 

Ml | | || II M II II II II II I II II II I I I III M I II I II II I II I II I II I 

TGCCTGAAGTTCAAAGCCAACGATCACGGATATGACAACTTCCGGTCCAGCAATAAATAC 1317 

AGCTCATCTTGA 1566 

Mill II I II I 
AGCTCGTCTTGA 1329
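
The summary figures printed with each result can be cross-checked by hand. The sketch below assumes, as an inference from the numbers in this listing rather than a documented definition, that Query Match is the hit score divided by the perfect score of 4301 reported in the search header, and that Best Local Similarity is Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). The function names are hypothetical.

```python
import re

PERFECT_SCORE = 4301  # "Perfect score" reported in the search header


def parse_counts(line):
    """Extract the 'Name value;' pairs from a Matches/Mismatches summary line."""
    return {name: int(value) for name, value in re.findall(r"(\w+)\s+(\d+);", line)}


def query_match(score, perfect=PERFECT_SCORE):
    """Score as a percentage of the perfect score (assumed definition)."""
    return 100.0 * score / perfect


def best_local_similarity(counts):
    """Matches as a percentage of all aligned columns (assumed definition)."""
    columns = (counts["Matches"] + counts["Conservative"]
               + counts["Mismatches"] + counts["Indels"])
    return 100.0 * counts["Matches"] / columns


# Summary line of the preceding result (Score 996, Length 1329).
counts = parse_counts("Matches 1136; Conservative 0; Mismatches 190; Indels 6; Gaps 2;")
print(f"Query Match: {query_match(996):.1f}%")
print(f"Best Local Similarity: {best_local_similarity(counts):.1f}%")
```

Under these assumptions the computed values agree with the 23.2% and 85.3% printed for the preceding result.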



RESULT 8 

AL571798/C 

LOCUS       AL571798     1201 bp    mRNA    linear   EST 31-MAY-2003
DEFINITION  AL571798 Homo sapiens PLACENTA COT 25-NORMALIZED Homo sapiens cDNA
            clone CS0DI030YM19 3-PRIME, mRNA sequence.
ACCESSION   AL571798
VERSION     AL571798.2  GI:31293189
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1201)
  AUTHORS   Li,W.B., Gruber,C., Jessee,J. and Polayes,D.
  TITLE     Full-length cDNA libraries and normalization
  JOURNAL   Unpublished (2001)
COMMENT     On Feb 16, 2001 this sequence version replaced gi:12929453.
            Contact : Genoscope
            Genoscope - Centre National de Sequencage
            BP 191 91006 EVRY cedex - France
            Email: seqref@genoscope.cns.fr, Web : www.genoscope.cns.fr
            Library was constructed by Life Technologies, a division of
            Invitrogen. This sequence belongs to sequence cluster 7006.r For
            more information about this cluster, see
            http://www.genoscope.cns.fr/
            cgi-bin/cluster.cgi?seq=CS0DI030AG10NP1&cluster=7006.r. Contact :
            Feng Liang Email : fliang@lifetech.com URL :
            http://fulllength.invitrogen.com/ InVitroGen Corporation 1600
            Faraday Avenue Genoscope sequence ID : CS0DI030AG10NP1.
FEATURES             Location/Qualifiers
     source          1..1201
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="CS0DI030YM19"
                     /tissue_type="PLACENTA COT 25-NORMALIZED"
                     /clone_lib="Homo sapiens PLACENTA COT 25-NORMALIZED"
                     /note="1st strand cDNA was primed with a NotI-oligo(dT)
                     primer. Five prime end enriched, double-strand cDNA was
                     digested with Not I and cloned into the Not I and EcoR V
                     sites of the pCMVSPORT 6 vector. Library was normalized."
ORIGIN



Query Match 22.9%; Score 987; DB 9; Length 1201; 

Best Local Similarity 96.0%; Pred. No. 1.4e-179; 

Matches 1030; Conservative 10; Mismatches 28; Indels 5; Gaps 3;



Qy 


489 


Db 


1068 


Qy 


549 


Db 


1010 


Qy 


609 


Db 


953 


Qy 


669 


Db 


893 



II II : I M I 



I I I I I 1 I I I I I I I I I I I I I 



: I I I I I I I I I I M I I 
- AGAYTT CAAAT ACAT CAA 1011 



CACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCTGAG 608 

Ml | I I : I : I I II I I I I I I I I I I I I I I I I I I I II : I I I I I I II I I I I I I M 

CAC-GTTKTKTCCTGCCTTGTGTTCGTGCTGGGATCATC — G G RAC T C C AC AC T T CT GAG 954 

AATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCCAGCTTGGC 668 

|| || | | M I I I II I II I I I I I I I I I I I I M M I I I I I I I I I M I II I I II I I II I I I I I I 

AAT TAT C T AC AAGAACAAGT G CAT GC GAAAC G GT C C CAAT AT CT T GAT C GC C AGCT T GGC 8 94 

TCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCTGGC 728 

| | | I | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I M I I 

T CTGGGAGACCTGCTGCACATCGT CAT TGACATCCCTAT CAAT GTCTACAAGCTGCT GGC 834 



Qy        729 AGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAGAAAGCCTC 788
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        833 AGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAGAAAGCCTC 774

Qy        789 CGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGAGCTGTTGC 848
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        773 CGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGAGCTGTTGC 714

Qy        849 TTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTT 908
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        713 TTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTT 654

Qy        909 GATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAATTAC 968
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        653 GATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAATTAC 594

Qy        969 GATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAGAAGACAGC 1028
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        593 GATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAGAAGACAGC 534

Qy       1029 TTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTT 1088
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        533 TTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTT 474

Qy       1089 GCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTGAGAAAGAA 1148
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        473 GCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTGAGAAAGAA 414

Qy       1149 AAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTGGCCAAAAC 1208
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        413 AAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTGGCCAAAAC 354

Qy       1209 CGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGAT 1268
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        353 CGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGAT 294

Qy       1269 TCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTGAGCTTTCT 1328
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        293 TCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTGAGCTTTCT 234

Qy       1329 GTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATTAACCCAAT 1388
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        233 GTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATTAACCCAAT 174

Qy       1389 TGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCTGCTG 1448
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        173 TGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCTGCTG 114

Qy       1449 GTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTAAAGTTCAA 1508
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        113 GTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTAAAGTTCAA 54

Qy       1509 AGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCAT 1561
I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I : : I I I I I : I : I 
Db         53 AGCTAATGATNANNGATATGACAACTTCCGTTCCAGKNNBNBGTSCAGCKCVT 1
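
In this result the Db coordinates run downward (1068 to 1) while the Qy coordinates run upward. That is consistent with the hit lying on the complementary strand of the database entry, as the "/C" suffix on the accession suggests; this reading is an interpretation, not something the listing states. A minimal sketch of reverse-complementing a sequence that contains IUPAC ambiguity codes such as K, Y, R and N follows. The function name is hypothetical, and the example input is a fragment of the Db row as printed above.

```python
# Complement of each IUPAC nucleotide code (standard IUPAC table).
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N", "-": "-",
}


def reverse_complement(seq):
    """Return the reverse complement, keeping ambiguity codes and gap characters."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(seq.upper()))


# Fragment of the Db row as printed in the alignment (contains the code Y).
fragment = "AGAYTTCAAATACATCAA"
print(reverse_complement(fragment))
```

Applying the function twice returns the original string, which is a quick sanity check on the complement table.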



RESULT 9
AY415513
LOCUS       AY415513     1144 bp    DNA     linear   GSS 17-DEC-2003
DEFINITION  Pan troglodytes EDNRB gene, VIRTUAL TRANSCRIPT, partial sequence,
            genomic survey sequence.
ACCESSION   AY415513
VERSION     AY415513.1  GI:39771472
KEYWORDS    GSS.
SOURCE      Pan troglodytes (chimpanzee)
  ORGANISM  Pan troglodytes
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Pan.
REFERENCE   1  (bases 1 to 1144)
  AUTHORS   Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
            Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
            Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
            Adams,M.D. and Cargill,M.
  TITLE     Inferring nonneutral evolution from human-chimp-mouse orthologous
            gene trios
  JOURNAL   Science 302 (5652), 1960-1963 (2003)
   PUBMED   14671302
REFERENCE   2  (bases 1 to 1144)
  AUTHORS   Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
            Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
            Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
            Adams,M.D. and Cargill,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-NOV-2003) Celera Genomics, 45 West Gude Drive,
            Rockville, MD 20850, USA
COMMENT     This sequence was made by sequencing genomic exons and ordering
            them based on alignment.
FEATURES             Location/Qualifiers
     source          1..1144
                     /organism="Pan troglodytes"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9598"
     gene            <1..>1144
                     /gene="EDNRB"
                     /locus_tag="HCM5582"
ORIGIN



Query Match 21.8%; Score 936.6; DB 29; Length 1144; 

Best Local Similarity 82.1%; Pred. No. 7e-170; 

Matches 939; Conservative 0; Mismatches 205; Indels 0; Gaps 0;



Qy 


423 


Db 


1 


Qy 


483 


Db 


61 


Qy 


543 


Db 


121 



I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I M I I I I I II I I I I 11 I II 

GGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCC 60 
AC GC AC CAT CTCCCCTCCCCCGTGC CAAGGAC C CAT C GAGAT CAAG GAGACT T T CAAAT A 542 

I I || I I I I I I I I I I I I I I I I I I i I I I I I I I I I I II I II I I I I I I I I I M I I M I I I I M I 

AC GC AC CAT CTCCCCTCCCCCGTGC CAAGGAC C CAT C GAGAT CAAGGAGACTT T CAAAT A 120 

CATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACT 602 

IN || I I M I M I I I II I I I I I I I I I I I I I I I I I I I I I I I M I I II I I I I I I I I I I I I 
CATNNACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACT 18 0 



Qy 



603 T CT GAGAAT TAT C T AC AAGAACAAGT GC AT GC GAAAC GGT CC C AAT AT CT T GAT C GC C AG 662 



181 T CT GAGNN T TAT CT ACAAGAACAAGT G CAT G C GAAACG GT C C CAAT AT CT T GAT C G C C AG 24 0 

663 CTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAGCT 722 

1 I 1 I I I I I I I I I 1 

241 CTTGGCTCTGGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCT 300 

723 G CT G GCAGAGGAC T GGC C AT T T G GAG CT GAGAT GT GT AAGCT GGTGCCTTT CAT ACAGAA 782 

| | | | I I | I I I I I I I II I I I I I ! I I I I I I M I I I I I I I I 1 I I I I I I I I I I I I I I I I M I I I 
301 GCT G G CAGAG GACT GGC CAT T T GGAGCT GAGAT GT GT AAG CT GGTGCCTTT CAT ACAGAA 360 

783 AGC CT CCGT GGGAAT CACT GT GCT GAGTCTAT GT GCT CT GAGTAT T GACAGATAT CGAGC 842 

| | | | | M | I I I I I I II II I I II I I I M I I I I I I I I M I I II I I I I I I I I I I M I M I I II 
361 AGCCT CC GT GGGAAT CACT GT GCT GAGT CTAT GT GCT CT GAGTAT T GACAGATAT CGAGC 420 

843 TGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAAAT 902 

MINI II I I I I I I I I I I I I II I I I I I I M M I I I M I I I I I I I I I M II I I I I I I I II 
421 TGTTGCATCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAAAT 480 

903 TGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATAT 962 

I I I I I I M I I I I I I I I I I M I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I 

481 TGTTTTGATTTGGGTGGTCTCTGTGGTTCTAGCTGTCCCTGAAGCCATAGGTTTTGATAT 540 

963 AATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAGAA 1022 

| | | | M | I I M I I M I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I II I I II II 
541 AATTACGATGGACTACAAAGGAAGTTATCTACGAATCTGCTTGCTTCATCCCGTTCAGAA 600 

1023 GAC AGCT T T CAT GC AGT T T T ACAAGACAGCAAAAGATT GGT G G C T GT T C AGT TT CT AT TT 1082 
I I II I I I I II I I I I I I 

601 GACAGCTTTCATGCAGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 660 

1083 CTGCTTGC CAT T GG C CAT CACT GC AT T T TT T T AT ACACT AAT GAC CT GT GAAAT GT T GAG 1142 

661 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 720 

1143 AAAGAAAAGT GGCAT GC AGAT T G CT T TAAAT GAT CAC CT AAAGC AGAGAC G GGAAGT GGC 1202 

I I II I I I II I I I I I 

721 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAGACGGGAAGTGGC 780 

1203 CAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAG 1262 

| | | | I I | | | | | I I M I M I I I I I I M I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I 
7 81 CAAAACTGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAG 840 

12 63 CAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTGAG 1322 
| | I I I I M I I I M I I I I I I I I I I I I I II I II II II I I I I I I II I I I I I I I I I M I I I I I I 
841 CAGGAT T CT GAAG CT C ACT CT T TAT AAT C AGAAT GAT C C CAAT AGAT GT GAACT T TT GAG 900 

1323 CTTTCTGTT GGT AT T GGACT AT AT T G GT AT C AACAT G GC T T CACT GAAT T C CT G CAT T AA 1382 
| | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I II I I I I I I I I I I I 
901 CTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATTAA 960 

1383 CCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTATG 1442 

I I I I I II I | M | | I I I I II I I I I I I I I M II II I I I I II I I I M I I I I I I I M II I I I M 
961 C CC AATT GCT CT GTATT T GGT GAGCAAAAGATT CAAAAACT GCTTT AAGT CAT GCTT AT G 1020 

1443 CT GCT GGT G C CAGT CAT TTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGT GCTT AAA 1502 
I I I I II I M II I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I M I II I I I I M II 



Db 



1021 CTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTAAA 1080 



Qy       1503 GTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCATC 1562
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1081 GTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCATC 1140

Qy       1563 TTGA 1566
              ||||
Db       1141 TTGA 1144



RESULT 10 

BI520706/C 

LOCUS       BI520706      957 bp    mRNA    linear   EST 29-AUG-2001
DEFINITION  603071813T1 NIH_MGC_119 Homo sapiens cDNA clone IMAGE:5163746 3',
            mRNA sequence.
ACCESSION   BI520706
VERSION     BI520706.1  GI:15345498
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 957)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Life Technologies, Inc.
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11406 row: j column: 03
            High quality sequence start: 4
            High quality sequence stop: 954.
FEATURES             Location/Qualifiers
     source          1..957
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5163746"
                     /tissue_type="medulla"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_119"
                     /note="Organ: brain; Vector: pCMV-SPORT6; Site_1: NotI;
                     Site_2: EcoRV (destroyed); RNA source normal medulla from
                     anonymous male age 27. Library is oligo-dT primed and
                     directionally cloned (EcoRV site is destroyed upon
                     cloning). Average insert size 1.3 kb, insert size range
                     0.9-3 kb. Library is normalized and enriched for
                     full-length clones and was constructed by C. Gruber
                     (Invitrogen). Research Genetics tracking code 013. Note:
                     this is a NIH_MGC Library."
ORIGIN



Query Match 20.1%; Score 866; DB 12; Length 957; 

Best Local Similarity 96.5%; Pred. No. 2.8e-156; 

Matches 917; Conservative 0; Mismatches 30; Indels 3; Gaps 3; 

Qy 569 T GT T C GT G CT GG GG AT CAT C G GGAAC T C C AC ACT T CT GAGAAT TAT CT ACAAGAACAAGT 62 8 

| | I I I I I I M I I I II I I I I I III I I I I I I I I I I I I I M I I II I I I I I I I I I 
Db 956 TGTTCGTGCT GGG G AT CAT C C GG ACT T C C AC ACT T CT GAGAAT ACT C T AC AGAAC CAAG G 897 

Qy 629 GCATGCGAAACGGTCCCAATATCTTGATCGCCAGCTTGGCTCTGGGAGACCTGCTGCACA 688

I MM mi MIIIMIIIMMIMM 

Db 896 CAATGCGAAACGGTCCCAATATCTGATCGCCAGCTTGGCTTCTGGGAGACCTGCTGCACA 837 

Qy 68 9 TCGTCATTGACATCCCTATCAATGTCTACAAGCTGCTGGCAGAGGACTGGCCATTTGGAG 748 

I I I I I I I I I 1 I I I MM I 1 I I I I I t I t I I I I t I I I I i i I I I ) Ml 

Db 836 T C GT CAT T GAC AT C C CT AT CAAGGT C T ACAAGCT G- T T G CAGAG GAC T GGC C AT T T GGAG 778 

Qy 749 CT GAGAT GT GT AAGCT GGT GCCTT T C AT ACAGAAAGCCT CCGT GGGAAT CACT GT GCT GA 808 

M I II II II I Ml M M I II I M I I M II M M 

Db 777 CT GAGAT GTGTAAGCT GGT GCCTTTCATACAGAAAGCCT CCGT GGGAAT CACT GT GCT GA 718 

Qy 809 GT CT AT GT GCT CT GAGT AT T GAC AGAT AT C GAGC T GTTGCTTCTT GGAGT AGAAT T AAAG 868 

| | || | | | || | I M II I II II II I I I I I M I I II II I M M II M I I M I II II 

Db 717 GT CT AT GT GCT CT GAGT AT T G AC AGAT AT CGAGCT GTTGCTTCTT GGAGT AGAATT AAAG 658 

Qy 869 GAATTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTTGATTTGGGTGGTCTCTGTGG 928

| M | || M I II I I I I I II M M I II I I I I M II I I I I I II II II I I II II I M I I I II II 

Db 657 GAATTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTTGATTTGGGTGGTCTCTGTGG 5 98 

Qy 929 TTCTGGCTGTCCCT GAAGC CAT AG GT T T T GAT AT AAT T AC GAT GGACT ACAAAGGAAGT T 988 

| M II M I M II I I I I II M M I M II II M M II I I M I II M I II I I II I I II I II II 

Db 597 T T CT GG CT GT C C CT GAAGC CAT AG GT T T T GAT AT AAT T AC GAT GGACT ACAAAGGAAGT T 538 

Qy 989 ATCTGCGAATCTGCTTGCTTCATCCCGTTCAGAAGACAGCTTTCATGCAGTTTTACAAGA 1048

I M II I I I II I I I II M M II II I I M I I I M I M M II M M I II M II M 

Db 537 AT CT GC GAAT CTGCTTGCTT CAT C C C GT T CAGAAGACAGCT T T CAT GC AGT T T T ACAAGA 478 

Qy 1049 CAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTTGCCATTGGCCATCACTGCAT 1108 

M | | | | || | II II I I II II I II II II I I I I I I I M I I I M II II I M II II II I II II II 

Db 477 CAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTTGCCATTGGCCATCACTGCAT 418 

Qy 1109 T T - TT TT ATACACTAAT GACCT GT GAAATGT T GAGAAAGAAAAGT GGCAT GCAGATT GCT 1167 

I | I I I I I I II I I II II M I I I II I I M I II I M II I M I I II I M II 

Db 417 T T GTT TT AT ACACTAAT GACCT GT GAAAT GT T GAGAAAGAAAAGT GGCAT GCAGATT GCT 358 

Qy 1168 TTAAATGATCACCTAAAGCAGAGACGGGAAGTGGCCAAAACCGTCTTTTGCCTGGTCCTT 1227 

| | | || | | | | M M II II I I I I I M II I I II I I M I I I II I II II II II II II I M I M M 
Db 357 T T AAAT GAT CAC CT AAAGCAGAGAC GG GAAGT GGC CAAAAC CGTCTTTTGCCTGGTCCTT 2 98 

Qy 1228 GTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGATTCTGAAGCTCACTCTTTAT 12 87 

I | | | | | | | || I I II I II I II II M II I I I II I M I I I II I I II I II M II II I II I M II 

Db 297 GTCTTTGCCCTCT GCT GGCTTCCCCTTCACCTCAGCAGGATTCTGAAGCT CACT CTTTAT 238 

Qy 1288 AAT CAGAAT GAT C C CAAT AGAT GT GAAC T T T T GAG CT T T CT GT T GGT AT T GGACT AT AT T 1347 

| | | || | | | || I II M II I I I M M I I I I I II I I I I I I II I I M I II M II I II I II I I M 

Db 2 37 AAT CAGAAT GAT C C CAAT AGAT GT GAAC T T T T GAGCT T T CT GT T GGT AT T GGACT AT AT T 178 



Qy       1348 GGTATCAACATGGCTTCACTGAATTCCTGCATTAACCCAATTGCTCTGTATTTGGTGAGC 1407
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        177 GGTATCAACATGGCTTCACTGAATTCCTGCATTAACCCAATTGCTCTGTATTTGGTGAGC 118

Qy       1408 AAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCT-GCTGGTGCCAGTCATTTGAAGA 1466
              ||||||||||| ||||||||||||||||||||||||| ||||||||||||||||||||||
Db        117 AAAAGATTCAACAACTGCTTTAAGTCATGCTTATGCTGGCTGGTGCCAGTCATTTGAAGA 58

Qy       1467 AAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTAAAGTTCAAAGCTAATG 1516
              ||||||||||||||||||||||||||||||||||||||| | |||||| |
Db         57 AAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTAAAGTTTACAGCTAAGG 8



RESULT 11 

AL553041/c 

LOCUS       AL553041     1201 bp    mRNA    linear   EST 31-MAY-2003
DEFINITION  AL553041 Homo sapiens PLACENTA COT 25-NORMALIZED Homo sapiens cDNA
            clone CS0DI072YK22 3-PRIME, mRNA sequence.
ACCESSION   AL553041
VERSION     AL553041.2  GI:31274855
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1201)
  AUTHORS   Li,W.B., Gruber,C., Jessee,J. and Polayes,D.
  TITLE     Full-length cDNA libraries and normalization
  JOURNAL   Unpublished (2001)
COMMENT     On Feb 15, 2001 this sequence version replaced gi:12892503.
            Contact: Genoscope
            Genoscope - Centre National de Sequencage
            BP 191 91006 EVRY cedex - France
            Email: seqref@genoscope.cns.fr, Web : www.genoscope.cns.fr
            Library was constructed by Life Technologies, a division of
            Invitrogen. This sequence belongs to sequence cluster 7006.r For
            more information about this cluster, see
            http://www.genoscope.cns.fr/
            cgi-bin/cluster.cgi?seq=CS0DI072BF11NP1&cluster=7006.r. Contact :
            Feng Liang Email : fliang@lifetech.com URL :
            http://fulllength.invitrogen.com/ InVitroGen Corporation 1600
            Faraday Avenue Genoscope sequence ID : CS0DI072BF11NP1.
FEATURES             Location/Qualifiers
     source          1..1201
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="CS0DI072YK22"
                     /tissue_type="PLACENTA COT 25-NORMALIZED"
                     /clone_lib="Homo sapiens PLACENTA COT 25-NORMALIZED"
                     /note="1st strand cDNA was primed with a NotI-oligo(dT)
                     primer. Five prime end enriched, double-strand cDNA was
                     digested with Not I and cloned into the Not I and EcoR V
                     sites of the pCMVSPORT 6 vector. Library was normalized."
ORIGIN



Query Match 20.0%; Score 860.2; DB 9; Length 1201; 

Best Local Similarity 89.9%; Pred. No. 3.5e-155; 

Matches 966; Conservative 32; Mismatches 54; Indels 22; Gaps 7; 

Qy 3179 AAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAA 3238

INI I I I I I I I |::| : I I : I : I I I I : : : : : I I I : : I 

Db 10 61 AAATCCMAAATTAMCTTTTTTYYTTWAAWGKCKGKCCACWKTKKRAAAAK CTWKA 1007 

Qy 3239 TGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGC 3298

II :||||| I I I I I I I I I I I I I I I I I I I I I I I : I 

Db 1006 AATTTKTTTCTTCCAATATTAATTTT T T TAC AT AAAC CAAAC C CAAC AAT T T KC C 952 

Qy 32 99 CAGAAAGAAAGAGCAAT AAT AAT T AAT T C ACACAC CAT AT GGAT T CT AT T T ATAAAT C AC 3358 

|| || : | : | | : I I I I I I I I I I : I : > I M I I : I I : I I I I I I I I I M I I I I 
Db 951 ATAAAKTAA — KKTCAWTATWATTAATT CACMCMCCATATKTAT YCTATTTATAAATCAC 894 

Qy 3359 C C ACAAACT T GT T CT T T AAT T T CAT C C CAAT C ACTTT T T CAG AGGC CT GTT AT C AT AGAA 3418 

1 I I I I : I I M : I I : I I M M I :: I I I I I I I I M I I I I I I I I I I : M I I I I I I I I I I I I I 
Db 893 C C AC AW ACT T KT T CTT T AAT T CMWT C C CAAT C ACT T T T T C AGAGKC CT GTT AT CAT AGAA 834 

Qy 3419 GT CAT T T T AGACT C T CAAT T T T AAAT T - AAT T T T GAAT CACT AAT AT T T T CAC AGTTT AT 3477 

I I I I I I I I II I I I I I I I I I I I I I I I I M 

Db 833 GT CAT T T T AGACT CT CAAT TT T AAAT T AAAT T T T GAAT CACT AAT ATT T T C ACAGT T TAT 774 

Qy 3478 T AAT AT AT T T AAT T T CT AT T T AAAT T T T AGAT TAT T T T TAT TAC CAT GT ACT GAAT TT T T 3537 

| | | | | M I II I 11 I I I I I I I M I I I I I M I I I I I I I I M I I I I M I I I I I I I I I I I I I I I 
Db 773 T AAT AT ATT T AAT T T CT AT T T AAAT T T TAG AT TAT T T T T ATT AC CAT GT ACT GAATT T T T 714 

Qy 3538 ACATCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAA 3597 

| 1 | I I I I I I 1 I I I I I II I I I I I I 1 I I I I I I I I I II I I I I I I I I I I M I I I I II I M I I I I 
Db 713 AC AT C CT GAT AC CCTTTCCTTCTC CAT GT C AGT AT CAT GT T CT CT AAT TAT CTT GC CAAA 654 

Qy 3598 T T T T GAAACT AC ACACAAAAAGCAT ACT T G CAT TAT T T AT AATAAAAT T GCAT T C AGT GG 3657 

| | | | I I I I I I I II I I I I I I I I I II I I I I I I I I M I I I M I I I I I I I II I I I I I I I I I I I 
Db 653 T T T T GAAACT AC ACACAAAAAGCAT AC T T G CAT TAT T T ATAAT AAAAT C GC AT T C AGT GG 594 

Qy 3658 CT T T T T AAAAAAAAT GT TT GAT T CAAAACT TT AAC ATACT GAT AAGTAAGAAACAAT TAT 3717 

| | | | | | | | | I I I I I I I I I I I I I I I I I I I I I I : I M I I I I I I I I I I I I I M I I I II II I I 
Db 593 CT TT T T - AAAAAAAT GT TT GAT T CAAAACT T T VAC ATACT GAT AAGTAAGAAACAAT TAT 535 

Qy 3718 AAT T T CT T TAC AT ACT CAAAAC CAAGAT AGAAAAAGGT GCT AT C GT T CAAC T T C AAAAC A 3777 

| | | | I II I I I I I I I I I I I I I I I I I I I I I M II I I I I I I I I I I M I I I I I I I I I I I I I M 
Db 534 AAT T T CT TN AC AT ACT CAAAAC CAAGAT AGAAAAAGGT GCT AT C GT T CAACT T CAAAAC A 475 

Qy 3778 T GT T T C CT AGT AT T AAG GACT T T AAT AT AG CAAC AGAC AAAAT T ATT GT T AAC AT GGAT G 3837 

| | | | I M I I I I I I II I I I I I M I I I I I I : I I : II I I I I I I I I I I I I I M I I I I I I I I I I I 
Db 474 T GT T T C CT AGT AT T AAG GACT T T AAT AT VG CMAC AGAC AAAAT TAT T GT T AAC AT GGAT G 415 

Qy 3838 T T AC AGCT CAAAAGAT T T AT AAAAGAT T T TAAC CT AT T TT CT C C CTT AT TAT CC ACT GCT 38 97 

| || I I II I I I I I I I I I I M I I I I I I I M I I I I M I I II I I I II I II I M I I I I 

Db 414 T T AC AG CT CAAAAGAT T TAT AAAAGAT T T TAAC CT ATT TT CT C C CTT AT TAT CC ACT GCT 355 

Qy 3898 AAT GT G GAT GT AT GT T CAAACAC CT T T T AGT AT T GAT AGCT T ACAT AT G G C CAAAGGAAT 3957 

| || I I I I 1 I I I M I I I I I I I I I I I I I M I I I I I I I I I I I I M M I I I I I II I M I I I I M 
Db 354 AAT GT G GAT GT AT GTT CAAACAC C T T T T AGT AT T GAT AG CTT AC AT AT GGC CAAAGGAAT 295 



Qy 



3958 ACAGT T TAT AG CAAAAC AT G GGT AT GCT GT AG CT AACT T T AT AAAAGT GT AAT AT AAC AA 4017 



I I I I I I I I I I I I I I I I t I I I I I I I I I II I I I I I I M I I I I I I I I M M N I I I I I I I I M 

Db 294 ACAGTTT AT AGCAAAACATGGGT AT GCT GT AGCTAACT TT ATAAAAGT GTAATATAACAA 235 

Qy 4018 TGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGA 4077

| | I | I I I I I I I I M I I I I I M MM I I I I I I I I I I I M II M II I I II 

Db 234 TGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGA 175 

Qy 4 07 8 T T T T T TAT TAT GT AAG C AAAAC C AAT AAAAAT T T AAGT T T T T T T - AACAACT AC CT T ATT 4136 

| M | | 1 I I I M I I M II I I I I I M I I M M I I I II I M I I II I I I I 

Db 174 T T T T T TAT TAT GTAAGCAAAACC AAT AAAAAT T T AAGT x T T T T T CAACAAC T AC CT TAT T 115 

Qy 4137 T T T C ACT GT AC AGAC ACT AAT T CAT T AAAT ACT AA T T GAT T GTT T AAAAGAAA 4189 

I | | I I I I I I I I I M M M II I M I II I M I I II I I I I I I II I M I II I 

Db 114 TTTCACTGTACAGACACTAATTCATTAAATACTCACACTCTCGCACTTGTTTAAAAGAAA 55

Qy 4190 TAT AAAT GT GACAAGT GGACATT ATTT AT GTTAAAT ATACAATT AT CAAGCAAG 4243 

| | | | | | : I : I : I I I I I II M I I I I M I I I M M I I I I I M I I I M M I I II I M 
Db 54 TATAAAKGBGMCAAGT GGACATT ATTTAT GTTAAAT ATACAATT AT CAAGCAAG 1 



RESULT 12 

AL543805 

LOCUS       AL543805      942 bp    mRNA    linear   EST 31-MAY-2003
DEFINITION  AL543805 Homo sapiens PLACENTA COT 25-NORMALIZED Homo sapiens cDNA
            clone CS0DI005YG20 5-PRIME, mRNA sequence.
ACCESSION   AL543805
VERSION     AL543805.2  GI:31265651
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 942)
  AUTHORS   Li,W.B., Gruber,C., Jessee,J. and Polayes,D.
  TITLE     Full-length cDNA libraries and normalization
  JOURNAL   Unpublished (2001)
COMMENT     On Feb 15, 2001 this sequence version replaced gi:12876284.
            Contact : Genoscope
            Genoscope - Centre National de Sequencage
            BP 191 91006 EVRY cedex - France
            Email: seqref@genoscope.cns.fr, Web : www.genoscope.cns.fr
            Library was constructed by Life Technologies, a division of
            Invitrogen. This sequence belongs to sequence cluster 7006.r For
            more information about this cluster, see
            http://www.genoscope.cns.fr/
            cgi-bin/cluster.cgi?seq=CS0DI005BD10QP1&cluster=7006.r. Contact :
            Feng Liang Email : fliang@lifetech.com URL :
            http://fulllength.invitrogen.com/ InVitroGen Corporation 1600
            Faraday Avenue Genoscope sequence ID : CS0DI005BD10QP1.
FEATURES             Location/Qualifiers
     source          1..942
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="CS0DI005YG20"
                     /tissue_type="PLACENTA COT 25-NORMALIZED"
                     /clone_lib="Homo sapiens PLACENTA COT 25-NORMALIZED"
                     /note="1st strand cDNA was primed with a NotI-oligo(dT)
                     primer. Five prime end enriched, double-strand cDNA was
                     digested with Not I and cloned into the Not I and EcoR V
                     sites of the pCMVSPORT 6 vector. Library was normalized."
ORIGIN

Query Match 19,8%; Score 851; DB 9; Length 942; 

Best Local Similarity 98.1%; Pred. No. 2.2e-153; 

Matches 876; Conservative 4; Mismatches 11; Indels 2; Gaps 2; 

1803 T TT AC AGT T AGCACT T C AAC AT AG CT CT T AACAACT T C C AGGAT AT T CAC ACAAC ACTT A 1862 
|| | | I I I I I II I I I I I I I M ! I I I I I I I I I I I I I I I I I I I I I I I I I I I M I M I 
51 T T C C C GGGAT G CACT T CAAC AT AGCT C TT AACAACT T C C AG GAT ATT CAC ACAACACTT A 110 

1863 GGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTAAA 1922
I I I I M I I I I I II I I I I I I I I I I I II I I M I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 
111 GGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTAAA 170 

1923 T CAAT GGGACT CT GAT AT AAAGGAAGAATAAGT CACT GT AAAACAGAACTTTT AAAT GAA 1982 

|| | | | | I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I M I I I I I I I I I I I I M I I I 
171 TCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATGAA 230 

1983 GCTT AAAT TACT CAAT T T AAAAT T TT AAAAT C C T T TAAAAC - AACT TTT CAAT T AAT AT T 2041 

I | | I I I I I I M I I II I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I M I I I I I I I I I I 
231 GCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACTAACTTTTCAATTAATATT 290

2042 AT CAC ACT AT TAT CAGAT T GT AAT TAGAT GCAAAT GAGAGAGCAGTT T AGTT GT T GC AT T 2101 
| | | || | I I I I I I I II I I I I I I I i I I I I II i I I I I I I M I I I I I II I I I I I I I I I I I I I I I 
291 ATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCATT 350 

2102 TTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGTT 2161 
I I I I I I M I I I I I I M I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I M I II I I I I I I I 
351 T TT C GGAC ACT GGAAAC AT T T AAAT GAT CAGGAGGGAGT AAC AGAAAGAGCAAGGCT GTT 410 

2162 TTT GAAAAT CAT T AC ACT T T CACT AGAAGC C CAAAC CT C AGC AT T CT GCAAT AT GTAAC C 2221 
I | M I I I I I I I I I I I I I I I I I I I I II I I I I I I M I I I I I I I I I I I I M I I I I I I I I I I I I 
411 TTT GAAAAT CAT T AC ACT T T CACTAGAAGC C CAAACCT C AGC AT T CT GCAAT AT GT AACC 470 

2222 AAC AT GT CACAAACAAGC AGCAT GTAAC AGACT GGC AC AT GT GC C AGCT GAATT TAAAAT 2281 

I I I M I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I M II I I I I I : I I I I 
471 AACAT GT C ACAAACAAG C AGC AT GTAAC AGACT G GCAC AT GT GC CAG C T GAAT T T WAAAT 530 

2282 AT AAT AC TTT T AAAAAGAAAAT TAT T AC AT C CT T TAC AT T C AGT T AAGAT CAAAC CT CAC 2341 

II I I I I I I I I I I I I I I I : I I I M I I I I I I I M I I I I I M I I I I I I I II I I I M I I II II 
531 ATAAT ACT T T T T AAAAGARAAT T ATT AC AT C CT T TAC AT T C AGT T AAGAT CAAAC CT CAC 590 

2342 AAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCAC 2401 

I I II I II I I I I I I I I I II I I I I I I I I I I II I I I I I I I M I I I M I I I I I I I I I Ml I 
591 AAAGAGAAAT AGAAT GT TT GAN AG G CT AT C C CAAAAGACT T T T TT GAAT C T GT CN T T CT C 650 

2402 AT AC C CT GT GAAG ACAAT ACT AT CT AC AAT T TT T T C AGGAT TAT TAAAAT CTTCTTTTTT 24 61 

II II I I I I I I I I I I I I M I M I I I I I I I I I I I I II I I I I I I I I I I I I : I I I I I I I I Ml 
651 AT AC C C T GT GAAGACAAT ACT AT CT ACAAT T T T T T CAG GAT TAT T AAMAT CTTCTTCTTT 710 

24 62 CACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACAC 2521 

II I I I I I I I I I I I I I I I I II I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M 
711 CACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACAC 770 



Qy 2522 T GCAT GT AGAT GATTAAAT GAGGGCAGGCC CT GT GCT CATAGCTTTACGAT GGAGAGAT G 2581 

I I I I I I I I I I I I M I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 771 T GCAT GT AGAT GATTAAMT GAGGGCAGGCC CT GT GCT CAT AGCTTTACGAT GGAGAGAT G 830 

Qy 2582 C C AGT GAC CT C AT AAT AAAGAC T GT GAACT GC C T GGT GCAGT GT C CAC AT GACAAAG G G G 2641 

I I I I I I I II M I I I I I I I I I I I I I I II I I I 1 I I I I I I I I I I M I I I I 

Db 831 C CAGT GACCT CATAATAAAGACT GT GAACT GCCT GGT GCAGT GT CCACAT GACAAAGGGG 890 

Qy 2642 CAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATG 2694 

I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I 
Db 891 CAGGTAGCACCCTCTCTCACCCATGCTGTGGTT-AAATGGTTTCTAGCATATG 942



RESULT 13
BQ229233
LOCUS       BQ229233                 891 bp    mRNA    linear   EST 02-MAY-2002
DEFINITION  AGENCOURT_7511051 NIH_MGC_72 Homo sapiens cDNA clone IMAGE:6055288
            5', mRNA sequence.
ACCESSION   BQ229233
VERSION     BQ229233.1  GI:20410633
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 891)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: ATCC/DCTD/DTP
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Agencourt Bioscience Corporation
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM13315 row: e column: 17
            High quality sequence stop: 696.
FEATURES             Location/Qualifiers
     source          1..891
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:6055288"
                     /tissue_type="melanotic melanoma"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NIH_MGC_72"
                     /note="Organ: skin; Vector: pCMV-SPORT6; Site_1: NotI;
                     Site_2: SalI; Cloned unidirectionally. Primer: Oligo dT.
                     Average insert size 2 kb. Library constructed by Life
                     Technologies."
ORIGIN

  Query Match             19.7%;  Score 848;  DB 13;  Length 891;
  Best Local Similarity   99.1%;  Pred. No. 8.3e-153;
  Matches  884;  Conservative    0;  Mismatches    5;  Indels    3;  Gaps



Qy 208 6 GTT T AGT T GT T GC AT T T T T C GGACAC T GGAAAC AT TT AAAT GAT C AG G AGGGAGT AAC AG 2145 

M I I I I M I I I I I II I I I I I I I I I I I M II II I I I I I I M I I I I I I I I I M I I I I I I i I I 
D b 1 GTT T AGT T GT T GC AT T T T T C G GACAC T G GAAAC AT T T AAAT GAT C AGGAGGGAGT AAC AG 60 

Qy 2146 AAAGAGCAAGGCTGTTTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCAT 2205

| | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I M I I I I I I I 
Db 61 AAAGAGCAAGGCTGTTTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCAT 120 

Qy 2206 T CT G CAAT AT GT AAC CAAC AT GT CACAAAC AAGC AGC AT GT AACAGACT GGC ACAT GT G C 2265 

|| M I I I I I I I I I I I I I I I I I I I I I I II M I II M I I I M M M I I I I I I I I I I I I I I I I 

Db 121 T CT G CAAT AT GT AAC CAAC AT GT CACAAAC AAGC AGC AT GT AACAGACT GGCAC AT GT GC 180 

Qy 2266 C AG CT GAAT T TAAAAT AT AAT ACT TT T AAAAAGAAAAT TAT T ACAT C CT T T AC AT T C AGT 2325 

| | | | | | | | | I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I 
Db 181 C AGC T GAAT T TAAAAT AT AAT ACT TT T AAAAAGAAAAT TAT T ACAT C CT T T AC AT T C AGT 24 0 

Qy 2326 TAAGAT CAAAC CT CACAAAGAGAAAT AGAAT GT T T GAAAGGCT AT C C CAAAAG ACT T T T T 2385 

| I I I M I I M I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I 
Db 241 TAAGATCAAACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTT 300 

Qy 238 6 T GAAT CT GT CAT T CAC AT AC C CT GT GAAGACAAT ACT AT CT ACAAT T T T T T CAG GAT TAT 24 45 

| | || | | | | | M I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I M I M I I I I I I M I I 
Db 301 T GAAT C T GT CAT T CAC AT AC C CT GT GAAGACAAT ACT AT CT AC AAT T TT T T CAGGAT TAT 360 

Qy 2446 TAAAAT CTTCTTTTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGT CAT CTGT AAAT 2505 

| | | | M I I I II I I I I I I I M M I I I I I I I I I I I I M I I I I I I I I I I I I I I M I I I I II I 
Db 361 T AAAAT CTT CTTCTTTCACT AT CGTAGCTTAAACTCTGTTTGGTTTTGT CAT CTGT AAAT 42 0 

Qy 2506 ACTTACCTACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCT 2565 

M I I I I I I I II I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I M I I II I I I I I I I I 
Db 421 ACTTACCTACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCT 480 

Qy 2566 TTACGAT GGAGAGAT GCCAGT GACCT CATAATAAAGACT GT GAACT GCCT GGT GCAGT GT 2625 

| | | | | | | | M | M II I I I I I I I I I I M II I I I I I I I I I I I I M 

Db 481 T T AC GAT GGAGAGAT GC C AGT GAC CT CAT AAT AAAGAC T GT GAACT GC CT GGT GCAGT GT 54 0 

Qy 2 62 6 CCACATGACAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTC 2685 

| | | | | | I I I M I I I I I I I I I I I I I I I I I I I II II I I I M I I I I I I I I I I I I I I II I I I I I 
Db 541 CCACAT GAC AAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTT AAAAT GGTTTC 600 

Qy 2686 TAGCATATGTATAATGCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACA 2745 

M I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I M I M I I I I I I I I I I I I I I I I I M I 
Db 601 T AGCAT AT GT AT AAT GCT AT AGT TAAAAT ACT AT T T T T CAAAAT C AT ACAGAT T AGT AC A 660 

Qy 274 6 TTTAACAGCTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAAT 28 05 

| M M I I I II I I I I I I I I I I I I M I I I I I I I I II I I I M I I I I I I I I I I I I I I I I I I I I I 
Db 661 TTTAACAGCTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAAT 72 0 

Qy 28 06 AGAAAAGTTTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGC-AAAACTGCTTTTTGAG 28 64 

|| | | || I II I I I I I I I II I I I I I I I I I M I I I I I I I I I I I M I I I I I I I 

Db 721 AGAAAAGTTTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAAACTGCTTTTTGAG 7 80 

Qy 2865 ACCGTAAGAACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAG 2924

| | I I I I I I I I I I I M I I M I I I I I I I I I I I I I I II I I M I I I I I I I I I I I I I I I I M

Db 781 ACCGTAAGAACCTCTTACCTTTGTGCGTTCCTGCCTAA-TTTT7VAATCTTCTAAGCAAAG 839

Qy 2925 TGCCTTAGGATAGCTTGGG-ATGAGATGTGTGTGAAAGTATGTACAAGAGAA 2975

| | I I I I I M | I I I I I I I I I Mill I I I I I I I

Db 840 TGCCTTAGGATAGCTTGGGAATGAGATGTGTGTGAAAATATGTACAAGAAAA 891



RESULT 14
AL546465
LOCUS       AL546465                1201 bp    mRNA    linear   EST 31-MAY-2003
DEFINITION  AL546465 Homo sapiens PLACENTA COT 25-NORMALIZED Homo sapiens cDNA
            clone CS0DI030YM19 5-PRIME, mRNA sequence.
ACCESSION   AL546465
VERSION     AL546465.2  GI:31268299
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 1201)
  AUTHORS   Li,W.B., Gruber,C, Jessee,J. and Polayes,D.
  TITLE     Full-length cDNA libraries and normalization
  JOURNAL   Unpublished (2001)
COMMENT     On Feb 15, 2001 this sequence version replaced gi:12879606.
            Contact: Genoscope
            Genoscope - Centre National de Sequencage
            BP 191 91006 EVRY cedex - France
            Email: seqref@genoscope.cns.fr, Web: www.genoscope.cns.fr
            Library was constructed by Life Technologies, a division of
            Invitrogen. This sequence belongs to sequence cluster 7006.r. For
            more information about this cluster, see
            http://www.genoscope.cns.fr/
            cgi-bin/cluster.cgi?seq=CS0DI030AG10QP1&cluster=7006.r. Contact:
            Feng Liang Email: fliang@lifetech.com URL:
            http://fulllength.invitrogen.com/ InVitroGen Corporation 1600
            Faraday Avenue Genoscope sequence ID: CS0DI030AG10QP1.
FEATURES             Location/Qualifiers
     source          1..1201
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="CS0DI030YM19"
                     /tissue_type="PLACENTA COT 25-NORMALIZED"
                     /clone_lib="Homo sapiens PLACENTA COT 25-NORMALIZED"
                     /note="1st strand cDNA was primed with a NotI-oligo(dT)
                     primer. Five prime end enriched, double-strand cDNA was
                     digested with Not I and cloned into the Not I and EcoR V
                     sites of the pCMVSPORT 6 vector. Library was normalized."
ORIGIN

  Query Match             19.0%;  Score 816.2;  DB 9;  Length 1201;
  Best Local Similarity   97.2%;  Pred. No. 1e-146;
  Matches  854;  Conservative    7;  Mismatches   15;  Indels    3;  Gaps

Qy        178 TGAAACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGC 237
              ||   || ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        219 TGTCTCTAGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGC 278



238 ATGCAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 297 

| | I | I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I 
279 ATGCAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 33 8 

298 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 357 

| | | | | I I I I II I I M I I I I I I I I I I I I I I I I I I I M I I I M I I I I I I I I I I I I I I I I I I I 
339 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 398 

358 C AAAC C GC AGAGAT AAT GAC G C C AC C C ACT AAGAC CT T AT GGC C CAAG GGT T C CAAC GC C 417 

II I I I I I I I M I I I M I I II I I I I I I I I M I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 
399 CAAAC C G C AGAGAT AAT GAC GC C AC C C ACT AAGAC CT TAT GGC C CAAGGGT T C CAAC G C C 458 

418 AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 477 

I II I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I 
459 AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 518 

47 8 C C GC C AC GCAC C AT CT CCCCTCCCCCGT GC CAAGGAC C CAT C G AGAT CAAGGAGACT T T C 537 

I | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I 
519 C C GC CACG C AC CAT CTCCCCTCCCCCGT GC CAAG GAC C CAT C GAGAT CAAGGAGACT T T C 57 8 

538 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 597 

I I I I I I I I I I I I I I M I I I I I I I M II I I I I I I I I I I I I I M I I I I I I I I I I I I I 

579 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 638 

598 ACACTT CT GAGAATTAT CTACAAGAACAAGT GCAT GC GAAACGGT CCCAATAT CTT GAT C 657 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I M 
639 AC ACT T CT GAGAATTAT CTACAAGAACAAGT GCAT GC GAAAC GGT C C CAAT AT CTT GAT C 698 

658 GCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTAC 717 

M | | | | | || I I I I I I I I II I I I I II I I I I I II I I I I II I I I I I M I I I I I I I I I I I I I I I 
699 GCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTAC 758 

718 AAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATA 777 

I I I I I I I I I M I I M I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I M I I I I I 
759 AAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATA 818 

77 8 CAGAAAGC CT C C GT GG GAAT C ACT GT GC T GAGT CTAT GT GCT CT GAGT AT T GAC AGAT AT 837 

| | | | | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I M M I I I I I 
819 CAGAAAGC CT C C GT GG GAAT CACT GT G CT GAGT CTAT GT G CT CT GAGT AT T GAC AGAT AT 878 

838 CGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTA 897 

| I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I II I I II I I I I I II I I M I I I I I I 
879 CGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTT-CAAAATGGACAGCAGTA 937 

898 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTT 957 

| | | | | | | | | I I : I I I I I I I II I I I I I I I I I I I M I I I I I I I I I I I M I I I I I I I I I I M 
938 GAAATTGTTTTKATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTVTT 997 

958 GAT AT AAT T AC GAT GGACT ACAAAGGAAGT T AT CT GC GAAT CTGCTTGCTT CAT C CC GTT 1017 

I I I I I I I I I I I I I I I I I M I I I I II I M I I I I I II I I II I I I I I I I I I I I : : M I : I 
998 GATATAATTACGATGGACTACAAAGG-AGTTATCTGCGAATCTGCTTGCTT-MWCCCSGT 1055 

1018 c AG AAGAC AGCT T T CAT G C AGT T T T AC AAGAC AG C AAAA 1056 

: I : I I I I I I I II I I I I I I I II I I II : I I I I 

1056 YARAAGAAAGCTTTCATGCAGTTTACAAAAMAGCAAAAA 1094 



RESULT 15
BI858627
LOCUS       BI858627                 972 bp    mRNA    linear   EST 10-OCT-2001
DEFINITION  603389094F1 NIH_MGC_87 Homo sapiens cDNA clone IMAGE:5398054 5',
            mRNA sequence.
ACCESSION   BI858627
VERSION     BI858627.1  GI:15999374
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 972)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: DCTD/DTP
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM12014 row: 1 column: 23
            High quality sequence stop: 899.
FEATURES             Location/Qualifiers
     source          1..972
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5398054"
                     /tissue_type="mammary adenocarcinoma, cell line"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NIH_MGC_87"
                     /note="Organ: breast; Vector: pCMV-SPORT6; Site_1: NotI;
                     Site_2: SalI; Cloned unidirectionally; oligo-dT primed.
                     Average insert size 1.383 kb. Library enriched for
                     full-length clones and constructed by Life Technologies.
                     Note: this is a NIH_MGC Library."
ORIGIN

  Query Match             18.8%;  Score 808.6;  DB 12;  Length 972;
  Best Local Similarity   96.3%;  Pred. No. 3.1e-145;
  Matches  903;  Conservative    0;  Mismatches   25;  Indels   10;  Gaps



Qy        321 GAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAAACCGCAGAGATAATGACGCC 380
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 GAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAAACCGCAGAGATAATGACGCC 60

Qy        381 ACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGTCTGGCGCGGTCGTTGGCACC 440
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 ACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGTCTGGCGCGGTCGTTGGCACC 120

Qy        441 TGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCCACGCACCATCTCCCCTCC 500
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCCACGCACCATCTCCCCTCC 180

Qy        501 CCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAATACATCAACACGGTTGTGTC 560
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 CCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAATACATCAACACGGTTGTGTC 240

Qy        561 CTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCTGAGAATTATCTACAA 620
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 CTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCTGAGAATTATCTACAA 300

Qy        621 GAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCCAGCTTGGCTCTGGGAGACCT 680
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 GAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCCAGCTTGGCTCTGGGAGACCT 360

Qy        681 GCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCTGGCAGAGGACTGGCC 740
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 GCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCTGGCAGAGGACTGGCC 420

Qy        741 ATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAGAAAGCCTCCGTGGGAATCAC 800
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 ATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAGAAAGCCTCCGTGGGAATCAC 480

Qy        801 TGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGAGCTGTTGCTTCTTGGAGTAG 860
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 TGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGAGCTGTTGCTTCTTGGAGTAG 540

Qy        861 AATTAAAGGAA-TTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTTGATTTGGGTGG 919
              ||||||||||| ||||||||||||||||||||||||||||||| | ||||||||||||||
Db        541 AATTAAAGGAACTTGGGGTTCCAAAATGGACAGCAGTAGAAATCG-TTTGATTTGGGTGG 599

Qy        920 TCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAATTACGATGGACTACA 979
              |||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
Db        600 TCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGG-TTTGATATAATTACGATGGACTACA 658

Qy        980 AAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAGAAGACAGCTTTCATGCAGT 1039
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        659 AAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAGAAGACAGCTTTCATGCAGT 718

Qy       1040 TTTACAAGACAGCAAAAGATTGGT-GGCTGTTCAGTTTCTATTTCTGCTTGCCATT-GGC 1097
              |||||||||||||||||||||||| |||| |||||||||||||||||||||||||  ||
Db        719 TTTACAAGACAGCAAAAGATTGGTGGGCTATTCAGTTTCTATTTCTGCTTGCCATNGGGG 778

Qy       1098 CATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTGAGAAAGAAAAGTGGCAT 1157
              |||||||||||||||||||||||||||||||||||| |||||||||| ||| |||||| |
Db        779 CATCACTGCATTTTTTTATACACTAATGACCTGTGACATGTTGAGAACGAACAGTGGCTT 838

Qy       1158 GCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTGGCCAAAACCGTCTTTTG 1217
              | | | I I I II 1 1 II II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 Ml 1 1 1 1 1
Db        839 G C AGAT - -GCTT T AAT GAT C AC CT AAAGC AGAGACGGAA GT G GCAAAAC CGTCTTTG 893

Qy       1218 CCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTC 1255
              |||||||| ||||||||||||||||||| ||||| | |
Db        894 CCTGGTCCCTGTCTTTGCCCTCTGCTGGGTTCCCTTAC 931


Search completed: May 14, 2004, 15:46:36 



Job time : 10214.3 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: May 13, 2004, 23:17:23 ; Search time 16247.7 Seconds 

(without alignments) 
11473.517 Million cell updates/sec 



Title: US-09-931-157-2 
Perfect score: 4301 

Sequence: 1 gagacattccggtgggggac ctgggaaaaaaaaaaaaaaa 4301 

Scoring table: IDENTITY_NUC 

Gapop 10.0 , Gapext 1.0 

Searched: 3470272 seqs, 21671516995 residues 

Total number of hits satisfying chosen parameters: 6940544 



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database : GenEmbl:*

1: gb_ba:*
2: gb_htg:*
3: gb_in:*
4: gb_om:*
5: gb_ov:*
6: gb_pat:*
7: gb_ph:*
8: gb_pl:*
9: gb_pr:*
10: gb_ro:*
11: gb_sts:*
12: gb_sy:*
13: gb_un:*
14: gb_vi:*
15: em_ba:*
16: em_fun:*
17: em_hum:*
18: em_in:*
19: em_mu:*
20: em_om:*
21: em_or:*
22: em_ov:*
23: em_pat:*
24: em_ph:*
25: em_pl:*
26: em_ro:*
27: em_sts:*
28: em_un:*
29: em_vi:*
30: em_htg_hum:*
31: em_htg_inv:*
32: em_htg_other:*
33: em_htg_mus:*
34: em_htg_pln:*
35: em_htg_rod:*
36: em_htg_mam:*
37: em_htg_vrt:*
38: em_sy:*
39: em_htgo_hum:*
40: em_htgo_mus:*
41: em_htgo_other:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
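
As a rough cross-check of the SUMMARIES columns, the "% Query Match" figures printed in this report appear to equal the hit score divided by the perfect score (4301), expressed as a percentage and rounded to one decimal place. The sketch below assumes that relationship, which is inferred from the printed numbers rather than from GenCore documentation; the function name `query_match_percent` is hypothetical.

```python
PERFECT_SCORE = 4301  # "Perfect score" from the report header


def query_match_percent(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Return the score as a percentage of the perfect score, to one decimal.

    Assumption: this mirrors how the "% Query Match" column is derived.
    It is inferred from the printed figures, not from the search program.
    """
    if perfect_score <= 0:
        raise ValueError("perfect_score must be positive")
    return round(100.0 * score / perfect_score, 1)


if __name__ == "__main__":
    # Scores taken from hits listed in this report.
    for score in (4301, 4286, 2857, 851, 848):
        print(score, query_match_percent(score))
```

Under this assumption, a garbled percentage in the table can be checked against its score, since the score and the perfect score are usually easier to read.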



SUMMARIES

                     %
Result              Query
   No.     Score    Match       Length   DB  ID         Description

     1      4301    100.0         4301    6  AR177880   AR177880 Sequence
     2      4301    100.0         4301    6  E07650     E07650 cDNA encodi
     3      4286     99.7         4286    9  HUMETR     D90402 Homo sapien
     4    4284.4     99.6         4286    6  AX548828   AX548828 Sequence
     5    4284.4     99.6         4286    6  AX587707   AX587707 Sequence
     6    4284.4     99.6         4286    9  S57283     S57283 Homo sapien
     7      2857     66.4         2972    9  D13162S7   D13168 Homo sapien
c    8    2841.8     66.1     11 33337    9  AL139002   AL139002 Human DNA
c    9    2792.4     64.9       201093    2  AC144750   AC144750 Pan trogl
    10      2610     60.7         2720   11  G06417     G06417 human STS W
c   11      2550     59.3       169751    2  AC130785   AC130785 Papio anu
c   12      2550     59.3       135870    2  AC129069   AC129069 Papio anu
    13    1691.8     39.3         1873    6  AR165435   AR165435 Sequence
    14    1691.8     39.3         1873    6  E15242     E15242 Human mRNA
    15    1690.8     39.3         1872    9  S44866     S44866 ETB endothe
    16    1495.4     34.8         1719    9  HUMEDNRB   L06623 Homo sapien
    17    1466.8     34.1         1470    6  AR270640   AR270640 Sequence
    18    1466.8     34.1         1470    9  HUMETSR    M74921 Human endot
    19      1389     32.3         1603    9  BC014472   BC014472 Homo sapi
    20      1389     32.3         1632    6  AX342673   AX342673 Sequence
    21    1361.4     31.7         1765    9  AF114165   AF114165 Homo sapi
    22    1327.4     30.9         1329    9  AY275463   AY275463 Homo sapi
    23    1322.6     30.8         1329    6  AX280873   AX280873 Sequence
    24    1222.8     28.4         1669    4  AF019072   AF019072 Equus cab
    25    1220.4     28.4         1578    9  HSX99250   X99250 H.sapiens m
    26    1197.8     27.8         2026    4  BOVEETBR   D90456 Bos taurus
    27      1186     27.6         1452    4  AF034530   AF034530 Canis fam
    28      1113     25.9         2018   10  S65355     S65355 nonselectiv
    29    1110.2     25.8         1551    6  E05930     E05930 DNA sequenc
    30    1104.8     25.7         2115   10  BC026553   BC026553 Mus muscu
    31    1099.2     25.6         1958    6  AX305434   AX305434 Sequence
    32    1099.2     25.6         1958   10  MMU32329   U32329 Mus musculu
    33      1091     25.4         1892   10  RNETBREC   X57764 Rat mRNA fo
    34      1091     25.4         1965    6  E03623     E03623 DNA encodin
    35    1086.6     25.3         1311    4  AF038900   AF038900 Equus cab
    36    1070.4     24.9         1321    6  AR207426   AR207426 Sequence
    37    1067.6     24.8         1314    4  AF276427   AF276427 Canis fam
    38    1042.8     24.2         1326    4  AF245469   AF245469 Oryctolag
c   39     931.8     21.7       135327    2  AC118537   AC118537 Felis cat
c   40     931.8     21.7  [illegible]  [illegible]  AC123546   AC123546 Felis cat
c   41     922.2     21.4       192330    2  AC122157   AC122157 Canis fam
    42       746     17.3         1564    5  AF472616   AF472616 Gallus ga
    43     732.6     17.0         1041    5  CCEDNRB    X99295 C.coturnix
    44       588     13.7          588   11  G15922     G15922 human STS C
    45     564.8     13.1         1520    5  AF275636   AF275636 Danio rer



ALIGNMENTS 



RESULT 1
LOCUS       AR177880                4301 bp    DNA     linear   PAT 17-DEC-2001
DEFINITION  Sequence 3 from patent US 6313276.
ACCESSION   AR177880
VERSION     AR177880.1  GI:17920235
KEYWORDS    .
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1 (bases 1 to 4301)
  AUTHORS   Imura,H., Nakao,K. and Nakanishi,S.
  TITLE     Human endothelin receptor
  JOURNAL   Patent: US 6313276-A 3 06-NOV-2001;
FEATURES             Location/Qualifiers
     source          1..4301
                     /organism="unknown"
                     /mol_type="unassigned DNA"
ORIGIN

  Query Match             100.0%;  Score 4301;  DB 6;  Length 4301;
  Best Local Similarity   100.0%;  Pred. No. 0;
  Matches 4301;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;



Qy          1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

Qy         61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120

Qy        121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180

Qy        181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240



241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300 



961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020 
Qy       1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080

Qy       1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140

Qy       1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200

Qy       1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260

Qy       1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320

Qy       1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380

Qy       1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440

Qy       1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500

Qy       1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560

Qy       1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620

Qy       1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680

Qy       1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740

Qy       1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800

Qy       1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860

Qy       1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920



Qy       1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980

Qy       1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040

Qy       2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100

Qy       2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160



Qy       2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220

Qy       2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280

Qy       2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340

Qy       2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400

Qy       2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460



Qy       2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520

Qy       2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580

Qy       2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640

Qy       2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700

Qy       2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760



Qy       2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820

Qy       2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880

Qy       2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940

Qy       2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000

Qy       3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060

Qy       3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120

Qy       3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180

Qy       3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240

Qy       3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300

Qy       3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360

Qy       3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420

Qy       3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480

Qy       3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540

Qy       3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600

Qy       3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660



Qy       3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720

Qy       3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780

Qy       3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840

Qy       3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900

Qy       3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960

Qy       3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020

Qy       4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080

Qy       4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140

Qy       4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200

Qy       4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260

Qy       4261 AAAATGCCACATTTCTGGTCTCTGGGAAAAAAAAAAAAAAA 4301
              |||||||||||||||||||||||||||||||||||||||||
Db       4261 AAAATGCCACATTTCTGGTCTCTGGGAAAAAAAAAAAAAAA 4301





RESULT 2
E07650
LOCUS       E07650                  4301 bp    RNA     linear   PAT 29-SEP-1997
DEFINITION  cDNA encoding endothelin receptor, ETB-receptor.
ACCESSION   E07650
VERSION     E07650.1  GI:2175785
KEYWORDS    JP 1994157595-A/2.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 4301)
  AUTHORS   Imura,H., Nakao,I. and Nakanishi,S.
  TITLE     HUMAN ENDOTHELIN RECEPTOR
  JOURNAL   Patent: JP 1994157595-A 2 03-JUN-1994;
            SHIONOGI & CO LTD
COMMENT     OS   Homo sapiens (human)
            PN   JP 1994157595-A/2
            PD   03-JUN-1994
            PF   12-JUL-1991 JP 1991172828
            PI   IMURA HIROO, NAKAO ICHIKAZU, NAKANISHI SHIGETADA
            PC   C07K13/00,C12N5/10,C12N15/12,C12P21/02,(C12N5/10,C12R1:91),
            PC   (C12P21/02,
            PC   C12R1:91);
            CC   strandedness: Double;
            CC   topology: Linear;
            CC   hypothetical: No;
            CC   anti-sense: No;
            FH   Key             Location/Qualifiers
            FH
            FT   source          1..4301
            FT                   /organism='Homo sapiens'
            FT                   /tissue_type='placenta'
            FT                   /clone='pHETBR31 and pHTBR34'
            FT   5'UTR           1..237
            FT   CDS             238..1566
            FT                   /product='endothelin receptor, ETB-receptor'
            FT   3'UTR           1567..4301
            FT   polyA_signal    4258..4263
            FT   polyA_signal    3638..3643
            FT   polyA_signal    3134..3139
            FT   polyA_signal    2594..2599
            FT   polyA_signal    1689..1694
FEATURES             Location/Qualifiers
     source          1..4301
                     /organism="Homo sapiens"
                     /mol_type="genomic RNA"
                     /db_xref="taxon:9606"
ORIGIN

  Query Match             100.0%;  Score 4301;  DB 6;  Length 4301;
  Best Local Similarity   100.0%;  Pred. No. 0;
  Matches 4301;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;



Qy          1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

Qy         61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120

Qy        121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180

Qy        181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240



241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300



421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480 

Qy        481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540

Qy        541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600

Qy        601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660

Qy        661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720

Qy        721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780



Qy        781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840

Qy        841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900

Qy        901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960

Qy        961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020

Qy       1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080



Qy       1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140

Qy       1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200

Qy       1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260

Qy       1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320

Qy       1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380

Qy       1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440

Qy       1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500

Qy       1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560

Qy       1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620



Qy       1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680

Qy       1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740

Qy       1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800

Qy       1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860



Qy       1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920

Qy       1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980

Qy       1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040

Qy       2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100

Qy       2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160

Qy       2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220



Qy       2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280

Qy       2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340

Qy       2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400

Qy       2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460

Qy       2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520

Qy       2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580

Qy       2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640

Qy       2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700

Qy       2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760
Qy       2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820

Qy       2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880

Qy       2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940

Qy       2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000

Qy       3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060

Qy       3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120

Qy       3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180

Qy       3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240

Qy       3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300

Qy       3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360

Qy       3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420

Qy       3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480

Qy       3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540

Qy       3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600

Qy       3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660



Qy 


3661 


T T T AAAAAAAAT GT T T GAT T C AAAACT T T AAC AT ACT G AT AAGT AAGAAAC AAT T AT AAT 

* i ■■ i ■ i i i i i i i i i i i i i 1 1 1 1 1 1 1 1 1 1 1 1 

M I | | 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 M M 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 II 1 1 1 M 1 

T T T AAAAAAAAT GTT T GAT T CAAAACT T TAAC ATACT GAT AAGT AAGAAACAAT TAT AAT 


3720 


Db 


3661 


3720 


Qy 


3721 


T T CT T T AC AT ACT CAAAAC CAAG AT AGAAAAAGGT GCT AT C GT T C AACT T CAAAAC AT GT 

t ■ i i ■ l l i l l I l t 1 L 1 1 1 | 1 1 1 1 1 1 1 

1 1 1 I I 1 1 I | M | I I | I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 II 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 M 1 

TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 


3780 


Db 


3721 


3780 


Qy 


3781 


T T CCT AGT AT T AAGGACT T T AAT AT AG C AAC AGACAAAAT T ATT GT TAAC AT GGAT GT T A 

I | | | | | | | | | | | | I | I 1 1 M 1 1 1 1 1 1 M II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 II 1 1 1 

TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 


3840 


Db 


3781 


3840 


Qy 


3841 


CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 

M | | | | | | | I I 1 I 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 

CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 


3900 


Db 


3841 


3900 


Qy 


3901 


GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 

1 | I I M I 1 1 1 1 1 1 1 II 1 1 I 1 1 1 1 1 II II 1 1 II 1 1 1 1 1 1 1 M 1 1 1 1 1 M 1 1 1 

GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 


3960 


Db 


3901 


3960 


Qy 


3961 


GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 

| M | | | | | || | I 1 1 1 1 1 II 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 N 1 1 1 1 1 1 1 1 1 1 M 

GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 


4020 


Db 


3961 


4020 


Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080


Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140


Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200




Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260


Qy 4261 AAAATGCCACATTTCTGGTCTCTGGGAAAAAAAAAAAAAAA 4301
        |||||||||||||||||||||||||||||||||||||||||
Db 4261 AAAATGCCACATTTCTGGTCTCTGGGAAAAAAAAAAAAAAA 4301
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Location/Qualifiers 

     source          1..4286
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
     gene            1..4286
                     /gene="ETR"
     CDS             238..1566
                     /gene="ETR"
                     /codon_start=1
                     /product="endothelin receptor"
                     /protein_id="BAA14398.1"
                     /db_xref="GI:219652"
                     /translation="MQPPPSLCGRALVALVLACGLSRIWGEERGFPPDRATPLLQTAE
                     IMTPPTKTLWPKGSNASLARSLAPAEVPKGDRTAGSPPRTISPPPCQGPIEIKETFKY
                     INTVVSCLVFVLGIIGNSTLLRIIYKNKCMRNGPNILIASLALGDLLHIVIDIPINVY
                     KLLAEDWPFGAEMCKLVPFIQKASVGITVLSLCALSIDRYRAVASWSRIKGIGVPKWT
                     AVEIVLIWVVSVVLAVPEAIGFDIITMDYKGSYLRICLLHPVQKTAFMQFYKTAKDWW
                     LFSFYFCLPLAITAFFYTLMTCEMLRKKSGMQIALNDHLKQRREVAKTVFCLVLVFAL
                     CWLPLHLSRILKLTLYNQNDPNRCELLSFLLVLDYIGINMASLNSCINPIALYLVSKR
                     FKNCFKSCLCCWCQSFEEKQSLEEKQSCLKFKANDHGYDNFRSSNKYSSS"
     polyA_site      4286
                     /gene="ETR"
ORIGIN



Query Match 99.7%; Score 4286; DB 9; Length 4286; 

Best Local Similarity 100.0%; Pred. No. 0; 
Matches 4286; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

Qy   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120

Qy  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180

Qy  181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240



241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300 



361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420 
Qy  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480

Qy  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540



541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600 

I I I Ml | | M I I I I I I I I I I I I M I I I I I I I Ml I I I I I I I I I Ml I I I I III I I 

541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600 
601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660 

MMIII I I I II II I I I I I M I I I I I I I I I I I M I I II I M I I II II Ml 

601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660 
661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720 

| N i I II I I I M I I II I I I I I ! I I I I I I I M I M I I I II I I II II I I II I M I I I 

661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720 

Qy  721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780

Qy  781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840



Qy  841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900

Qy  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960

Qy  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020



1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080 



1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080 

1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140 

| | | | M | | M | | | I I I I I I M I I I I I II I I I I I M I M I I I I M I I I I I I I I I 

1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140 

1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200 

| | | M I I I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200 

1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260

| | | | | | | M | M I I I I I I I I M I I I I I I I I I I I I M I I I I I I I I I I I I I II I I I I I I I I I 
1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260 

1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320 

M I I I I II I I I I I I I I I I I M I I I I I M II II I I I I I M M I I I I I I I M I I I I I I I I I I 
1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320 

1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380 

| | | I I I I I I I I I I I M I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380 

1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440 

M I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440 

1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500 

| || | | | | | | | | || I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I 

1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500 

1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560 

| | | | | | | | | | | | | | || I I I I I I I I I I I I I I I I I I I I I M M I II I I I I I I I I I M M I I I 
1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560 

1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620 

| | | | | || | | | | I I I I I I I I I I I I II I II I I II I I I I I I I I M I I I I I I I I I I I I I 

1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620 

Qy 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680

1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740 

| | | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M II I I I I II I I I I I I I I I 
1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740 

1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800 

| | | | | | | | | | | I I I I M I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I M I I 
1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800 

1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860

| | | | | M I II I I I I I II M I I M I I M I I I I I I I I I I I I I I I M II I I I I I I I I I I I I I I 
1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860 

1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920 

M | | I M I I I I I I II I I I II I I II I I I I M I I I I I I I M I I I I I M I I I I I I I I I M I I I 



1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920 
1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980 

I | | | M I I I I I I I I M I II I I I I I I I I I I M M I I I I I I M I I I I I I I I I I I I I M I I I I 

1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980

1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040 

| | | | | | | | | | | | | | M I I I I I I I I I I I I I M M I I I I I I I I I I I I I 

1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040 

2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100 

| | | | | | | | | | | | | | M I I I I I I I II I I I I I I I I I I I MINIMUM! 

2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100 

2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160 

MM II I I I I I I I I I I I I I I I I I I I I I I I I M I 

2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160 

2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220 

| I I I I M I I I I I I I I I I I I I M I I M II I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I 

2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220 

Qy 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280

2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340 

| | | | | | M I I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I M I M I I I I I I I I I I I I I I 
2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340 

2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400 

| | | | | | M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I II I I I I I I I I I I I I I I I I 
2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400 

2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460

| | | | | | M I M I I I I I I I I I II I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I 

2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460 

2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520 

| | | | | | M I I I I I I I M I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I M I I I 
2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520 

2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580 

| | | | | | | M I I I I I I I I I II I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580 

2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640 

| | I | | | M I I I I I I I I I M I I I M I I I I I II I I I I I I M I I I I I I I M I M I I I I I M I I 

2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640 

2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700 

| | | | | | M II I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I 
2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700 

2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760 

M M I I I I M I I I I M I II I I I II I I I I M M I I I I I I I I I I I I M I I I I I I I I I I I I I I 
2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760



Qy 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820

Qy 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880

Qy 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940

Qy 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000

Qy 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060

Qy 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120

Qy 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180

Qy 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240

Qy 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300

Qy 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360

Qy 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420

Qy 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480

Qy 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540

Qy 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600



Qy 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660



Qy 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720

Qy 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780

Qy 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840

Qy 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900

Qy 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960

Qy 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020

Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080

Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140

Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200

Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
        ||||||||||||||||||||||||||
Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
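
The summary figures printed for each result can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of the search program's output. It assumes that "Query Match" is the result score divided by the perfect score of 4301 given in the run header, and that "Best Local Similarity" is the number of matches divided by the aligned length (matches + mismatches + indels). Both function names are hypothetical.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect score, to one decimal place.

    Assumption: this is how the report derives its "Query Match" column.
    """
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches: int, mismatches: int, indels: int) -> float:
    """Matches as a percentage of aligned positions, to one decimal place.

    Assumption: aligned length = matches + mismatches + indels.
    """
    aligned = matches + mismatches + indels
    return round(100.0 * matches / aligned, 1)


if __name__ == "__main__":
    # Figures taken from the two result summaries in this listing.
    print(query_match_percent(4286, 4301))    # result with Score 4286
    print(query_match_percent(4284.4, 4301))  # result with Score 4284.4
    print(best_local_similarity(4285, 1, 0))  # 4285 matches, 1 mismatch
```

Under these assumptions the computed values agree with the percentages printed in the result summaries (99.7%, 99.6% and 100.0%).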
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AX548828 4286 bp DNA 

Sequence 113 from Patent WO02061087. 
AX548828 

AX548828.1 GI:25813723 
Homo sapiens (human) 



linear PAT 26-NOV-2002 



ORGANISM Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
REFERENCE 1 

AUTHORS Burmer,G.C., Roush,C.L. and Brown,J.P.

TITLE Antigenic peptides, such as for G protein-coupled receptors 

(GPCRs), antibodies thereto, and systems for identifying such 
antigenic peptides 
JOURNAL Patent: WO 02061087-A 113 08-AUG-2002; 
Lifespan Biosciences, Inc. (US) 
FEATURES Location/Qualifiers 
source 1..4286
/organism="Homo sapiens"
/mol_type="unassigned DNA"
/db_xref="taxon:9606"

ORIGIN 

Query Match 99.6%; Score 4284.4; DB 6; Length 4286;

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

Qy   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120

Qy  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180

Qy  181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
        |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db  181 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240

Qy  241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300

Qy  301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360

Qy  361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420

Qy  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480

Qy  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540



Qy  541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600

Qy  601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660

Qy  661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720

Qy  721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780

Qy  781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840



Qy  841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900

Qy  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960

Qy  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020

Qy 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080



Qy 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140

Qy 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200

Qy 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260

Qy 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320

Qy 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380



Qy 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440

Qy 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500

Qy 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560

Qy 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620

Qy 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680

Qy 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740

Qy 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800



Qy 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860

Qy 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920

Qy 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980

Qy 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040

Qy 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100

Qy 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160

Qy 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220

Qy 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280

Qy 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340

Qy 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400

Qy 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460



Qy 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520

Qy 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580

Qy 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640

Qy 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700

Qy 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760



Qy 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820



Qy 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880

Qy 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940



Qy 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000

Qy 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060



3061 C GT C AC AT C AAT G C AAAAG GT C CT GAT T T T GT T C C AGC AAAAC AC AGT G C AAT GT T CT C A 3120 

| | | | | I I I II II II I II I M M I I II M II I M I M I I I M M M I I I M I I 

3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120 



3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180 

| | | | | | | | | 1 | I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I M I I I I M I I I M I I I > I 
3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180

3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 324 0 

| | | M I II I I I I I I II I I I I I I I M I I I I I I I I I I I I I I I M I I I I I I 1 I I I I 

3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240 

3241 TTGTTTTCTGT CAAT AT T GAAT GT GAT G GT AC AGT AAAC CAAAAC CCAACAAT GT GGC C A 3300 

| | | | | | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I M I I M 
3241 TTGTTTTCTGT CAAT AT T GAAT GT GAT GGT AC AGT AAAC CAAAAC C C AACAAT GT GGC C A 3300 

3301 GAAAG AAAGAGC AAT AAT AAT T AAT T C AC AC AC CAT AT G GAT T CT ATT T AT AAAT C AC C C 3360 

| | M | | I I I I I I I I I I I I I M M I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I 
3301 GAAAGAAAGAGC AAT AAT AAT T AAT T CAC ACAC CAT AT GGAT T CT AT T TAT AAAT C AC C C 3360 

3361 ACAAACT T GT T CT T T AAT T T CAT C C CAAT C ACT T T T T C AGAGGC CT GT TAT C AT AGAAGT 3420 

| | | M | I I I I I I I I I I I I I I I I M I I I I I I I M I I I I II I I I I I I 

3361 ACAAACT T GT T CT T T AAT T T CAT C C CAAT CACT T T T T C AGAGG C CT GT TAT CAT AGAAGT 3420 

3421 CAT T TT AG ACT CT CAAT T T T AAAT T AAT TT T GAAT C ACTAAT AT T T T C ACAGTTT AT T AA 3480 

| M I I I I I I I I I I I I I I I I I II I II I M I I I I I I I I M I I I I II I I I I 

3421 CAT T T T AGAC T CT CAAT T T T AAAT T AAT T T T GAAT CACT AAT AT T T T C ACAGT T TAT T AA 3480 

3481 TAT AT T T AAT T T CT AT T T AAAT T T TAG AT TAT T T T TAT T AC CAT GT ACT GAAT T T T T AC A 3540 

| | M | I I II I M I I I I M M I I I I I I I I I I I I II M I I II I I I I I I I I i M I I I I 

3481 TAT ATT T AAT T T CT AT T TAAAT T T T AGAT TAT T T T TAT T AC CAT GT ACT GAAT TT TT AC A 354 0 

3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600 

| | | | | | | | | | I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I M I I M I I II I I I I I I I I 
3541 T C CT GATAC C CTTT C CT T CT C CAT GTCAGTAT CAT GT T CT CT AAT TAT CTTGCC AAAT TT 3600 

3601 T GAAACT ACAC ACAAAAAG CAT ACT T GCAT TAT T T ATAAT AAAATT GC AT T CAGT GGCT T 3660 

| | M I I I I II I I I I II I I I I I I I I I I I I II I I I M I I I I I I I M M M I 

3601 T GAAACT ACAC ACAAAAAGC AT ACT T GCAT TAT T T AT AAT AAAAT T GCATT CAGT GGCT T 3660 

3661 T T T AAAAAAAAT GT T T GAT T CAAAACT T TAAC AT ACT GATAAGTAAGAAAC AAT TAT AAT 3720 

| | | | | | || M I I I I I I I I I I I I I I I M II I II I I I M I I I M I I II II II I M M I I I M 
3661 T T T AAAAAAAAT GT T T GAT T CAAAACT T TAAC AT ACT GAT AAGT AAGAAAC AAT TAT AAT 3720 

3721 T T CT T T AC AT ACT CAAAAC C AAGAT AG AAAAAG GT G CT AT C GT T C AACT T CAAAAC AT GT 3780 

| | | | | | | | | I || I I I I I I II I I I I I II M I I M I I I I I I I M I I M I I I I I I II I I I I I I 
3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780 

3781 T T C C T AGT AT T AAG GAC T T T AAT AT AG CAAC AGAC AAAAT TAT T GT T AACAT GGAT GT T A 3840 

| | | | | | | | | || I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I i I I I I I I I I I I 
3781 T T C C T AGT AT T AAG GACT T T AAT AT AG CAAC AGACAAAAT TAT T GT TAAC AT GGAT GT T A 384 0 

3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900

II I I I | I I I II I I I I I I I I I I I I I I I I I I I I M I I I I II I II I I I I I I I M 

3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900 

3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960

I I M II I I I I I I I I I I I M I M M I I I II I I i I I I I I I I I 

3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960 



Qy 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020

Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080

Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140

Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200

Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
        ||||||||||||||||||||||||||
Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286



RESULT 5
AX587707
LOCUS       AX587707 4286 bp DNA linear PAT 10-JAN-2003
DEFINITION  Sequence 177 from Patent WO0246467.
ACCESSION   AX587707
VERSION     AX587707.1 GI:28212378
KEYWORDS
SOURCE      synthetic construct
  ORGANISM  synthetic construct
            artificial sequences.
REFERENCE   1
  AUTHORS   Bertucci,F., Houlgatte,R., Birnbaum,D., Nguyen,C., Viens,P. and
            Fert,V.
  TITLE     Gene expression profiling of primary breast carcinomas using arrays
            of candidate genes
  JOURNAL   Patent: WO 0246467-A 177 13-JUN-2002;
            Ipsogen (FR)
FEATURES             Location/Qualifiers
     source          1..4286
                     /organism="synthetic construct"
                     /mol_type="unassigned DNA"
                     /db_xref="taxon:32630"
                     /note="primer"
     misc_feature    1..4286
                     /note="endothelin receptor type b (EDNRB) gene."
ORIGIN

Query Match 99.6%; Score 4284.4; DB 6; Length 4286;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps 0;



Qy    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120 

| I I I I I I I I I I I I I II I I I I I I I I I I 1 I I I I I I I I I ! 

61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 12 0 

121 AG GAT CAACACAGT GGCT GAACACT GGGAAGGAACT GGT ACTT GGAGT CT GGACAT CT GA 180 

| | | | | | | M | I I M II I I I I I I I I I I I I I I M I I I I I II I I I I I II I I M I I I M I I I I I 
121 AG GAT CAACACAGT GGCT GAACACT GGGAAGGAACT GGTACTT GGAGT CT GGACAT CT GA 180 

181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240 

| | | | | I I I I II I I I I I I I I I I I I I II I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I M 
181 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240 

241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300 

| | | M I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I M I I I II I 
241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300 

301 T C GC G GAT C T GG GGAGAG GAGAGAG GCTTCCCGCCT GAC AG GGC CACT C C GCT T T T GCAA 360 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I II I II I 
301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360 

361 AC C G C AGAGAT AAT GAC GC CAC C CACT AAGAC CT TAT GGC C CAAGGGTT C CAAC GC C AGT 420 

| | | I I I I I I I M I II II II I I I I I I I I I I I I I I I I I I M I I I M 

361 AC C GC AGAGAT AAT GAC G C CAC C CACT AAGAC CT TAT GGC C CAAGGGTT C CAAC G C C AGT 420 

421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 4 80 

| | | | | | || I II I I I I II I I I I I I I I I I I I I I I I I I M M I I I I I I I I I I I I I I I I I I I M 
421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480 

481 C C AC GCAC CAT CTCCCCTCCCCCGT GC CAAGGAC C CAT C GAGATCAAGGAGACT T T C AAA 540 

| M | | | | | M I I I I I I I I I I II I I II I I I II I I M I I M I I I I M I I I I I I I I M I I I I I 
481 C CAC GCAC CAT CTCCCCTCCCCCGTGC CAAGGAC C CAT C GAGATCAAG GAGACT T T CAAA 540 

541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600 

| | I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I M I I I I II 
541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600 

601 CT T CT GAGAAT TAT CT ACAAGAACAAGT GC AT GC GAAAC G GT C CCAAT AT CT T GAT C GCC 660 

II I I I I I I I I I I i I I I I II I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

601 CT T CT GAGAAT TAT CT ACAAGAACAAGT GC AT GC GAAAC GGT C C CAAT AT CT T GAT C GCC 660 

661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720 

I I I I || I I I I I I I M I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I 

661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720 

721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780 

M | I || | I I | I II I I I II I I I I I I I I I II I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I 
721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780 

781 AAAGC CT C CGT GGGAAT CACTGT GCT GAGT CTAT GT GCT CT GAGT ATT GACAGAT AT CGA 840 

I I I I I I I I || I I I I I I I I I I I I I I I II I I I I I I I I I M I I I I I I I I I M I II I II I I I I I 
781 AAAGC CT C CGT GGGAAT CACTGT GCT GAGT CTAT GT GCT CT GAGT ATT GACAGAT AT CGA 84 0 

Qy  841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900

Qy  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960



Qy  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020

Qy 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080

Qy 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140

Qy 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200

Qy 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260

Qy 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320

Qy 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380



Qy 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440



Qy 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500

Qy 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560

Qy 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620

Qy 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680



Qy 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740



1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800

| | M | | | | | | I M I | | | II I I I I I I 1 I I I I I I I M I II I I I I II I I I II I I I I 

1741 T T AC G G CAT GGAAAGAAAAT C AGT G GGAAT T AAGAAAG C CT C GT C GT GAAAGCACT T AAT 1800 

1801 T T T T T AC AGT TAG C ACT T C AAC AT AG C T C T T AAC AACT T C C AGGAT AT T C ACAC AACAC T 1860 

| | | | | M I I I I I I I M M M I I I I I I I I M II I I I I I I I I ' 

1801 TT T T T ACAGT T AGC ACT T CAAC AT AGC T CT T AACAACT T C CAGGAT AT T C AC ACAAC ACT 1860 

1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920 

I I I I I I I I I I | M I I I I I I I I I I I I M I I I I II I I I M I I I I I I I I I I 

1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920 

1921 AAT CAAT GGGACT CT GAT ATAAAGGAAGAAT AAGT CACT GTAAAACAGAACTTTTAAAT G 1980 

I M I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I 

1921 AAT CAAT GGGACT CT GAT ATAAAGGAAGAAT AAGT CACT GTAAAACAGAACTTTTAAAT G 1980 

1981 AAG CT T AAAT TACT CAAT T T AAAAT T T T AAAAT C C T T T AAAAC AACT T T T CAAT T AAT AT 204 0 

| | | | | | M I I I II I I M I I I I I I I I I I I I I I I I I I I M I I I I I I I II M I I I I 

1981 AAGCT T AAAT TACT CAAT T T AAAAT T T TAAAAT C CT T T AAAACAACT T T T CAAT T AAT AT 204 0 

2041 TAT C ACAC TAT TAT CAGAT T GT AAT T AGAT GCAAAT GAGAGAGCAGT T T AGT T GT T GCAT 2100 

M || | | | I I I I I I I I I I I I M I I I I I II I I I I I I I I I M I I I 

2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100 

2101 T T T T C GGAC ACT GGAAAC AT T T AAAT GAT CAGGAGGGAGTAAC AGAAAGAG C AAGGCT GT 2160 

M | | | | | | I I I I I I I II II I I I I I M I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160 

2161 TT TT GAAAAT CATT ACACTT T CACT AGAAGC CCAAAC CT CAGCATT CT GCAAT AT GTAAC 222 0 

M M | | | I I I I I I I I I I I Mill M I I I I I I I I I I I I I I I I I I I I M I I 

2161 T T T T GAAAAT CAT T AC ACT T T CACT AGAAGC C CAAAC CT C AGC AT T CT G CAAT AT GTAAC 2220 

2221 CAAC AT GT C ACAAACAAGC AGC AT GTAAC AGACT G GC AC AT GT GC C AG C T GAAT T T AAAA 2280 

| | | | | | | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I M I M I II I I I I I I M I I I I I I 
2221 CAAC AT GT C ACAAACAAGC AGC AT GTAAC AGACT G GC AC AT GT GC C AG C T GAAT T T AAAA 2280 

2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340 

Ml Ill II I I I I I I I I I I I M I I I I I I I I I I I M I I I I M I I I 

2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340 

2341 C AAAGAGAAAT AGAAT GT T T GAAAG G C T AT C C C AAAAGAC T T T T T T GAAT C T GT CAT T C A 2400 

| | | | | || I I I M I I I I I I I I I I I II I I I I I M I I I I I M I I I I I I I I M I I I I I I I I I I I 
2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400 

2401 CAT AC C CT GT GAAGACAAT ACT AT CT AC AAT T TT T T CAG GAT TAT TAAAAT CTTCTTTTT 2460 

| | | | | | | I | I I I M I I I I I I I I I I II I I I I II M I I I I I I I I I I I I I M I I I I I I I I I I I 
2401 CAT AC C CT GT GAAGACAAT ACT AT CT ACAAT T TT T T CAG GAT TAT TAAAAT CTTCTTTTT 2460 

2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520 

| M | | | | | | | | | I I I I I I I I I I I I I I I I I I I I I I I M M I I I I I I I I I I I I I I I I I I I I I 
24 61 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520 

2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 258 0 

|| | | | | | | || | | M | I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 
2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 258 0 



Qy 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640

Qy 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700

Qy 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760

Qy 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820

Qy 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880

Qy 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940

Qy 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000

Qy 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060

Qy 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120

Qy 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180

Qy 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240



Qy 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300

Qy 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360

Qy 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420

Qy 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480

Qy 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540

Qy 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600

Qy 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
3661 TTTAAAAAAAAT GTTT GATT CAAAACTTTAACATACT GATAAGTAAGAAACAATT AT AAT 3720 

M M I M I II I I I I M II I II I I II I I M II II I M I M M II M M M 

3661 TTTAAAAAAAAT GTT T GAT T CAAAACTTTAACATACTGATAAGTAAGAAACAATTAT AAT 372 0 
3721 TT CTTTACATACT CAAAACCAAGAT AGAAAAAGGT GCT AT C GTT CAACTTCAAAACAT GT 37 8 0 

M M I M I I II M II M II I M I M II I I M II I M M II M M M II M II M M M II 

3721 TT CT TT ACAT ACT C AAAAC CAAGAT AGAAAAAGGT GCT AT C GT T CAACT T CAAAACAT GT 378 0 



Qy 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840

Qy 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900

Qy 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960

Qy 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020

Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080

Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140

Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200

Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
        ||||||||||||||||||||||||||
Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
LOCUS       S57283 4286 bp mRNA linear PRI 18-MAR-2002
DEFINITION  Homo sapiens endothelin ET-B receptor mRNA, complete cds.
ACCESSION   S57283
VERSION     S57283.1 GI:298321
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 4286)
  AUTHORS   Arai,H., Nakao,K., Hosoda,K., Ogawa,Y., Nakagawa,O., Komatsu,Y. and
            Imura,H.
  TITLE     Molecular cloning of human endothelin receptors and their
            expression in vascular endothelial cells and smooth muscle cells
  JOURNAL   Jpn. Circ. J. 56 Suppl 5, 1303-1307 (1992)
  MEDLINE   93180293
   PUBMED   1291713
  REMARK    GenBank staff at the National Library of Medicine created this
            entry [NCBI gibbsq 128424] from the original journal article.
            This sequence comes from Fig. 5.
FEATURES             Location/Qualifiers
     source          1..4286
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
     CDS             238..1566
                     /gene="endothelin ET-B receptor, ET-BR"
                     /note="This sequence comes from Fig. 5; ET-BR"
                     /codon_start=1
                     /product="endothelin ET-B receptor"
                     /protein_id="AAB25531.1"
                     /db_xref="GI:298322"
                     /translation="MQPPPSLCGRALVALVLACGLSRIWGEERGFPPDRATPLLQTAE
                     IMTPPTKTLWPKGSNASLARSLAPAEVPKGDRTAGSPPRTISPPPCQGPIEIKETFKY
                     INTVVSCLVFVLGIIGNSTLLRIIYKNKCMRNGPNILIASLALGDLLHIVIDIPINVY
                     KLLAEDWPFGAEMCKLVPFIQKASVGITVLSLCALSIDRYRAVASWSRIKGIGVPKWT
                     AVEIVLIWVVSVVLAVPEAIGFDIITMDYKGSYLRICLLHPVQKTAFMQFYKTAKDWW
                     LFSFYFCLPLAITAFFYTLMTCEMLRKKSGMQIALNDHLKQRREVAKTVFCLVLVFAL
                     CWLPLHLSRILKLTLYNQNDPNRCELLSFLLVLDYIGINMASLNSCINPIALYLVSKR
                     FKNCFKSCLCCWCQSFEEKQSLEEKQSCLKFKANDHGYDNFRSSNKYSSS"



ORIGIN 



Query Match 99.6%; Score 4284.4; DB 9; Length 4286;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps 0;



Qy    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

Qy   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120



Qy 121 AG GAT CAAC AC AGT GG C T GAAC AC T GG GAAGGAACT G GT ACT T G GAGT C T G GAC AT C T GA 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I 
Db 121 AG GAT CAAC ACAGT GG C T GAAC AC T GG GAAGGAACT GGT AC T T G GAGT CT G GAC AT C T GA 180 

Qy 181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I II I I I II I I M 
Db 181 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240 

Qy 241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I 
Db 241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300 

Qy 301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360 

I I I I I I I I I I I i II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I 
Db 301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360 

Qy 361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420

Qy 421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480

Qy 481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540

Qy 541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600

Qy 601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660

Qy 661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720

Qy 721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780

Qy 781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840

Qy 841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900

Qy 901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960



Qy 961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020

Qy 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080

Qy 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140

Qy 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200

Qy 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260

Qy 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320

Qy 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380

Qy 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440

Qy 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500

Qy 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560

Qy 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620

Qy 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680

Qy 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740

Qy 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800



Qy 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860

Qy 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920

Qy 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980

Qy 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040

Qy 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100

Qy 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160

Qy 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220

Qy 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280

Qy 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340

Qy 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400

Qy 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460

Qy 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520

Qy 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580

Qy 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640

Qy 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700

Qy 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760

Qy 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820

Qy 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880

Qy 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940

Qy 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000

Qy 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060

Qy 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120

Qy 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180

Qy 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240

Qy 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300

Qy 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360

Qy 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420

Qy 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480

Qy 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540

Qy 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600

Qy 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660

Qy 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720

Qy 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780

Qy 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840

Qy 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900

Qy 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960

Qy 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020

Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080

Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140

Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200

Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
        ||||||||||||||||||||||||||
Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286



LOCUS       D13162S7                2972 bp    DNA     linear   PRI 12-OCT-2002
DEFINITION  Homo sapiens hET-BR gene for endothelin-B receptor, complete cds
            and exon 7.
ACCESSION   D13168
VERSION     D13168.1  GI:285924
KEYWORDS    .
SEGMENT     7 of 7
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 2972)
  AUTHORS   Arai,H., Nakao,K., Takaya,K., Hosoda,K., Ogawa,Y., Nakanishi,S. and
  TITLE     The human endothelin-B receptor gene. Structural organization and
            chromosomal assignment
  JOURNAL   J. Biol. Chem. 268 (5), 3463-3470 (1993)
  MEDLINE   93155196
   PUBMED   8429023
REFERENCE   2  (bases 1 to 2972)
  AUTHORS   Arai,H.
  TITLE     Direct Submission
  JOURNAL   Submitted (02-SEP-1992) Hiroshi Arai, Kyoto University School of
            Medicine, Second Division, Department of Medicine; 54 Shogoin,
            Kawahara-cho, Sakyo-ku, Kyoto, Kyoto 606, Japan
            (Tel:81-75-751-3170, Fax:81-75-771-9452)
FEATURES             Location/Qualifiers
     source          1..2972
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
                     /chromosome="13"
                     /clone_lib="human genomic libraries"
     gene            join(D13162.1:1002..1742,D13163.1:11..123,
                     D13164.1:11..215,D13165.1:11..160,D13166.1:11..144,
                     D13167.1:11..119,11..2865)
                     /gene="hET-BR"
     CDS             join(D13162.1:1260..1742,D13163.1:11..123,
                     D13164.1:11..215,D13165.1:11..160,D13166.1:11..144,
                     D13167.1:11..119,11..145)
                     /gene="hET-BR"
                     /note="G protein-coupled receptor"
                     /codon_start=1
                     /product="endothelin-B receptor"
                     /protein_id="BAA02445.1"
                     /db_xref="GI:285926"
                     /translation="MQPPPSLCGRALVALVLACGLSRIWGEERGFPPDRATPLLQTAE
                     IMTPPTKTLWPKGSNASLARSLAPAEVPKGDRTAGSPPRTISPPPCQGPIEIKETFKY
                     INTVVSCLVFVLGIIGNSTLLRIIYKNKCMRNGPNILIASLALGDLLHIVIDIPINVY
                     KLLAEDWPFGAEMCKLVPFIQKASVGITVLSLCALSIDRYRAVASWSRIKGIGVPKWT
                     AVEIVLIWVVSVVLAVPEAIGFDIITMDYKGSYLRICLLHPVQKTAFMQFYKTAKDWW
                     LFSFYFCLPLAITAFFYTLMTCEMLRKKSGMQIALNDHLKQRREVAKTVFCLVLVFAL
                     CWLPLHLSRILKLTLYNQNDPNRCELLSFLLVLDYIGINMASLNSCINPIALYLVSKR
                     FKNCFKSCLCCWCQSFEEKQSLEEKQSCLKFKANDHGYDNFRSSNKYSSS"
     exon            11..2865
                     /gene="hET-BR"
                     /product="endothelin-B receptor"
                     /note="G protein-coupled receptor"
                     /number=7
                     /evidence=experimental
ORIGIN

Query Match 66.4%; Score 2857; DB 9; Length 2972; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 2857; Conservative 0; Mismatches 0; Indels 0; Gaps 0;
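
The summary figures for this hit can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only. It assumes, since the listing does not say so, that Query Match is the hit score expressed as a percentage of the perfect score of 4301 given in the run header. It also checks that the query span 1430..4286 and the database span 9..2865 each cover 2857 positions, which agrees with 2857 matches and no mismatches, indels or gaps. The function names are hypothetical and not part of the search program.

```python
# Illustrative consistency check for the summary lines of this hit.
# Helper names are hypothetical; the Query Match formula is an assumption.
PERFECT_SCORE = 4301  # "Perfect score" reported in the run header


def query_match_percent(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Assumed definition: hit score as a percentage of the perfect score."""
    return round(100.0 * score / perfect_score, 1)


def span_length(start: int, end: int) -> int:
    """Number of positions in an inclusive coordinate range."""
    return end - start + 1


if __name__ == "__main__":
    print(query_match_percent(2857))                       # compare with "Query Match"
    print(span_length(1430, 4286), span_length(9, 2865))   # compare with "Matches"
```

Under these assumptions the computed percentage can be compared with the reported Query Match of 66.4%, and both span lengths with the reported match count.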
Qy 1430 AGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGC 1489
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    9 AGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGC 68

Qy 1490 AGTCGTGCTTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATA 1549
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   69 AGTCGTGCTTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATA 128

Qy 1550 AATACAGCTCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACC 1609
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  129 AATACAGCTCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACC 188

Qy 1610 GAAGTCATTAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCA 1669
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  189 GAAGTCATTAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCA 248

Qy 1670 CAGCACACTATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATT 1729
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  249 CAGCACACTATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATT 308

Qy 1730 TTATGAGCTGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGA 1789
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  309 TTATGAGCTGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGA 368

Qy 1790 AAGCACTTAATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATT 1849
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  369 AAGCACTTAATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATT 428

Qy 1850 CACACAACACTTAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAG 1909
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  429 CACACAACACTTAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAG 488

Qy 1910 ATTTATTTTTAAATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGA 1969
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  489 ATTTATTTTTAAATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGA 548

Qy 1970 ACTTTTAAATGAAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTT 2029
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  549 ACTTTTAAATGAAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTT 608

Qy 2030 TCAATTAATATTATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTT 2089
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  609 TCAATTAATATTATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTT 668

Qy 2090 AGTTGTTGCATTTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAG 2149
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  669 AGTTGTTGCATTTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAG 728

Qy 2150 AGCAAGGCTGTTTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTG 2209
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  729 AGCAAGGCTGTTTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTG 788

Qy 2210 CAATATGTAACCAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGC 2269
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  789 CAATATGTAACCAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGC 848

Qy 2270 TGAATTTAAAATATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAG 2329
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  849 TGAATTTAAAATATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAG 908

Qy 2330 ATCAAACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAA 2389
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  909 ATCAAACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAA 968

Qy 2390 TCTGTCATTCACATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAA 2449
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  969 TCTGTCATTCACATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAA 1028

Qy 2450 ATCTTCTTTTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTT 2509
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1029 ATCTTCTTTTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTT 1088

Qy 2510 ACCTACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTAC 2569
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1089 ACCTACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTAC 1148

Qy 2570 GATGGAGAGATGCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCAC 2629
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1149 GATGGAGAGATGCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCAC 1208

Qy 2630 ATGACAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGC 2689
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1209 ATGACAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGC 1268

Qy 2690 ATATGTATAATGCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTA 2749
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1269 ATATGTATAATGCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTA 1328

Qy 2750 ACAGCTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAA 2809
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1329 ACAGCTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAA 1388

Qy 2810 AAGTTTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGT 2869
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1389 AAGTTTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGT 1448

Qy 2870 AAGAACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCT 2929
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1449 AAGAACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCT 1508

Qy 2930 TAGGATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAG 2989
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1509 TAGGATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAG 1568

Qy 2990 GAAATGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTT 3049
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1569 GAAATGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTT 1628

Qy 3050 CGTCATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTG 3109
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1629 CGTCATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTG 1688

Qy 3110 CAATGTTCTCAGAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAA 3169
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1689 CAATGTTCTCAGAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAA 1748

Qy 3170 AATATGCCCAAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATA 3229
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1749 AATATGCCCAAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATA 1808

Qy 3230 AGCTAGTAATGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAA 3289
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1809 AGCTAGTAATGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAA 1868

Qy 3290 CAATGTGGCCAGAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTT 3349
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1869 CAATGTGGCCAGAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTT 1928

Qy 3350 ATAAATCACCCACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTT 3409
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1929 ATAAATCACCCACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTT 1988

Qy 3410 ATCATAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCA 3469
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1989 ATCATAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCA 2048

Qy 3470 CAGTTTATTAATATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACT 3529
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2049 CAGTTTATTAATATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACT 2108

Qy 3530 GAATTTTTACATCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATC 3589
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2109 GAATTTTTACATCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATC 2168

Qy 3590 TTGCCAAATTTTGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCA 3649
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2169 TTGCCAAATTTTGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCA 2228

Qy 3650 TTCAGTGGCTTTTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAA 3709
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2229 TTCAGTGGCTTTTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAA 2288

Qy 3710 ACAATTATAATTTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACT 3769
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2289 ACAATTATAATTTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACT 2348

Qy 3770 TCAAAACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAA 3829
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2349 TCAAAACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAA 2408

Qy 3830 CATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTAT 3889
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2409 CATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTAT 2468

Qy 3890 CCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCC 3949
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2469 CCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCC 2528

Qy 3950 AAAGGAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAA 4009
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2529 AAAGGAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAA 2588

Qy 4010 TATAACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATA 4069
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2589 TATAACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATA 2648

Qy 4070 GTTACTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTA 4129
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2649 GTTACTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTA 2708

Qy 4130 CCTTATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAA 4189
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2709 CCTTATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAA 2768

Qy 4190 TATAAATGTGACAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAA 4249
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2769 TATAAATGTGACAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAA 2828

Qy 4250 GTTATTCAATTAAAATGCCACATTTCTGGTCTCTGGG 4286
        |||||||||||||||||||||||||||||||||||||
Db 2829 GTTATTCAATTAAAATGCCACATTTCTGGTCTCTGGG 2865





RESULT 8 

AL139002/C 

LOCUS 

DEFINITION 

ACCESSION 
VERSION 

KEYWORDS 

SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 
TITLE 
JOURNAL 



COMMENT 



ALX39002 183337 bp DNA linear PRI 28-JAN-2001 

Human DNA sequence from clone RP11-318G21 on chromosome 
13q22. 2-31.1, complete sequence. 
AL139002 

AL139002. 18 GI: 12597038 
HTG. 

Homo s apiens ( human ) 
Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Mammalia; Eutheria; Primates; Catarrhini; Homimdae; Homo. 
1 (bases 1 to 183337) 
Wall,M. 

Direct Submission . 
Submitted (28- JAN-2001) Sanger Centre, Hinxton, Cambridgeshire, 
CB10 ISA, UK. E-mail enquiries: humquery@sanger.ac.uk Clone 
requests: clonerequest@sanger.ac.uk 

On Jan 29, 2001 this sequence version replaced gi:1258435b. 
During sequence assembly data is compared from overlapping clones. 
Where differences are found these are annotated as variations 
toqether with a note of the overlapping clone name. Note that the 
variation annotation may not be found in the sequence submission 
corresponding to the overlapping clone, as we submit sequences with 
only a small overlap as described above. 

This sequence has been finished according to sequence map criteria 



as follows. An attempt is made to resolve all sequencing problems,
such as compressions and repeats, but not necessarily within known
annotated repeat sequence elements. Where the sequence is
ambiguous, there is an annotation using the 'unsure' feature key.

The following abbreviations are used to associate primary accession
numbers given in the feature table with their source databases
Em:, EMBL; Sw:, SWISSPROT; Tr:, TREMBL; Wp:, WORMPEP; Information
on the WORMPEP database can be found at
http://www.sanger.ac.uk/Projects/C_elegans/wormpep This sequence
was generated from part of bacterial clone contigs of human
chromosome 13, constructed by the Sanger Centre Chromosome 13
Mapping Group. Further information can be found at
http://www.sanger.ac.uk/HGP/Chr13

RP11-318G21 is from the library RPCI-11.2 constructed by the group
of Pieter de Jong. For further details see
http://www.chori.org/bacpac/home.htm

... the entire insert of clone RP11-318G21. The true
left end of clone RP11-267I18 is at 125528 in this sequence.

FEATURES             Location/Qualifiers
     source          1..183337
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
                     /chromosome="13"
                     /map="q22.2-31.1"
                     /clone="RP11-318G21"
                     /clone_lib="RPCI-11.2"
     repeat_region   3..454
                     /note="L1MC5 repeat: matches 7127..7575 of consensus"
     repeat_region   1216..1308
                     /note="HAL1 repeat: matches 1475..1563 of consensus"
     repeat_region   1309..1597
                     /note="AluJb repeat: matches 1..297 of consensus"
     repeat_region   1598..2044
                     /note="HAL1 repeat: matches 1003..1475 of consensus"
     repeat_region   2148..2276
                     /note="L2 repeat: matches 2620..2749 of consensus"
     repeat_region   2330..2378
                     /note="L2 repeat: matches 2442..2492 of consensus"
     repeat_region   3915..4224
                     /note="AluY repeat: matches 1..306 of consensus"
     repeat_region   4617..4750
                     /note="67 copies 2 mer cc 61% conserved"
     repeat_region   4648..4727
                     /note="20 copies 4 mer cctt 78% conserved"
     repeat_region   4729..4784
                     /note="14 copies 4 mer tcct 78% conserved"
     repeat_region   5431..5736
                     /note="AluSx repeat: matches ..305 of consensus"
     repeat_region   11990..12273
                     /note="AluSx repeat: matches ..292 of consensus"
     repeat_region   12589..12809
                     /note="MIR repeat: matches 7..234 of consensus"
     repeat_region   13390..13519
                     /note="L2 repeat: matches 2410..2548 of consensus"
     repeat_region   14630..14978
                     /note="THE1B repeat: matches 1..360 of consensus"
     repeat_region   15092..15580
                     /note="L1MB1 repeat: matches 5656..6116 of consensus"
     repeat_region   15581..16095
                     /note="L1PA7 repeat: matches 5629..6143 of consensus"
     repeat_region   16096..16549
                     /note="L1MB1 repeat: matches 5188..5656 of consensus"
     repeat_region   16731..16777
                     /note="MIR repeat: matches 35..78 of consensus"
     repeat_region   16778..17137
                     /note="THE1C repeat: matches 1..371 of consensus"
     repeat_region   17138..17273
                     /note="MIR repeat: matches 78..226 of consensus"
     repeat_region   17374..17484
                     /note="MIR repeat: matches 26..158 of consensus"
     repeat_region   17485..17777
                     /note="AluSc repeat: matches 1..290 of consensus"
     repeat_region   17778..17815
                     /note="MIR repeat: matches 158..191 of consensus"
     repeat_region   18981..19048
                     /note="34 copies 2 mer tt 66% conserved"
     repeat_region   19447..19589
                     /note="MIR repeat: matches 131..262 of consensus"
     repeat_region   19843..20162
                     /note="MER33 repeat: matches 1..324 of consensus"
     repeat_region   20866..21198
                     /note="MER44A repeat: matches 3..333 of consensus"
     repeat_region   21742..21878
                     /note="MIR repeat: matches 9..154 of consensus"
     repeat_region   22214..22310
                     /note="MIR repeat: matches 164..260 of consensus"
     repeat_region   22321..22418
                     /note="L1MB8 repeat: matches 6078..6171 of consensus"
     misc_feature    22390..22715
                     /note="Sequence from AC018674 sequenced by WUGSC."
     repeat_region   22419..22730
                     /note="AluY repeat: matches 1..311 of consensus"
     repeat_region   22731..23714
                     /note="L1MB8 repeat: matches 5130..6078 of consensus"
     repeat_region   23715..24008
                     /note="AluSg repeat: matches 1..294 of consensus"
     repeat_region   24009..24264
                     /note="L1MB8 repeat: matches 4884..5130 of consensus"
     repeat_region   24265..24569
                     /note="AluY repeat: matches 1..305 of consensus"
     repeat_region   24570..25577
                     /note="L1MB8 repeat: matches 3786..4884 of consensus"
     repeat_region   25582..25635
                     /note="27 copies 2 mer tt 70% conserved"
     repeat_region   26221..26571
                     /note="MSTA repeat: matches 1..347 of consensus"
     repeat_region   26572..26750
                     /note="AluY repeat: matches 129..307 of consensus"
     repeat_region   26752..27066
                     /note="AluY repeat: matches 1..311 of consensus"
     repeat_region   27067..27134
                     /note="MSTA repeat: matches 347..371 of consensus"



     repeat_region   27963..28006
                     /note="22 copies 2 mer tt 75% conserved"
     repeat_region   28916..29282
                     /note="MER39 repeat: matches 13..381 of consensus"
     repeat_region   29282..29517
                     /note="MER39b repeat: matches 327..579 of consensus"
     repeat_region   30210..30526
                     /note="AluJo repeat: matches 1..303 of consensus"
     repeat_region   31423..31572
                     /note="L1PA13 repeat: matches 6005..6155 of consensus"
     repeat_region   31587..31624
                     /note="19 copies 2 mer tt 86% conserved"
     repeat_region   32103..32181
                     /note="ORSL repeat: matches 390..467 of consensus"
     repeat_region   33878..34312
                     /note="MER57A repeat: matches 1..433 of consensus"
     repeat_region   36673..36768
                     /note="LTR37A repeat: matches 81..172 of consensus"
     repeat_region   36769..37066
                     /note="AluSq repeat: matches 1..296 of consensus"
     repeat_region   37067..37300
                     /note="LTR37A repeat: matches 172..424 of consensus"
     repeat_region   39470..39501
                     /note="16 copies 2 mer tt 90% conserved"
     repeat_region   41434..42607
                     /note="L1M4 repeat: matches -258..888 of consensus"
     repeat_region   42744..43220
                     /note="L1M4 repeat: matches 1085..1580 of consensus"
     repeat_region   43703..44007
                     /note="AluJb repeat: matches 1..305 of consensus"
     repeat_region   44019..44180
                     /note="L1MD1 repeat: matches 6044..6211 of consensus"
     repeat_region   44183..44485
                     /note="AluJo repeat: matches 1..300 of consensus"
     repeat_region   44486..44611
                     /note="L1MD2 repeat: matches 5949..6066 of consensus"
     repeat_region   45256..45430
                     /note="MER5B repeat: matches 1..178 of consensus"
     repeat_region   45588..45669
                     /note="MER5A repeat: matches 109..188 of consensus"
     repeat_region   46919..47084
                     /note="MIR repeat: matches 94..260 of consensus"
     repeat_region   47873..47902
                     /note="15 copies 2 mer tg 90% conserved"
     repeat_region   49906..49933
                     /note="7 copies 4 mer tgtg 96% conserved"
     repeat_region   50452..50507
                     /note="LTR37A repeat: matches 128..184 of consensus"
     repeat_region   51786..51829
                     /note="MER74A repeat: matches 271..309 of consensus"
     repeat_region   51830..52172
                     /note="THE1B repeat: matches 1..364 of consensus"
     repeat_region   52173..52221
                     /note="MER74A repeat: matches 221..271 of consensus"
     repeat_region   52710..53396
                     /note="L1MB7 repeat: matches 5451..6171 of consensus"
     repeat_region   53890..53996
                     /note="MIR repeat: matches 35..152 of consensus"
     repeat_region   54259..54296
     repeat_region   54299..55519

Query Match 66.1%; Score 2841.8; DB 9; Length 183337;

Best Local Similarity 99.9%; Pred. No. 0;
Matches 2854; Conservative 0; Mismatches 2; Indels 1; Gaps

14^ AGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGC 14 89 
1430 ^ATGCTTAlbC , , , , , , , , , , , , , , , | , , | , , | | | | | | | | | | | | 

72830 AGTCATGCTTATGCTGCT 72771 
1490 AGTCGTGCTTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAA 1549 

n i i i i I l l l l I I I I I I I I I I I I I I I I I I M I I I 1 1 I M I I I M I I I I M M I I I I I M I 

72770 AGTCGTGCTTAAAGTTC 72711 



1550 AATACAGCTCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATA^ 1609 

-2710 aatacagctcatcttgaaa^ 72651 

1610 GAAGTCATTAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATCT 1669 

111IM M M I ! I I! I I I I I I ! I II I I I I I M I M I II I i I I M M ! 

,2650 GAAGTCATTAA^^ 72591 

1670 CAGCACACTATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACAT 1729 

i i I I I I I I I i I I I I I I M | I I I M I I I I I M I I I I II I I I I I I M I I I M I I I I I M M I 

CAGCACACTATTAAAATATTAAGTGTAA^ 72531 



72590 w^--* 

1730 TTATGAGCTGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCCT 1789 

i m i i i i i i i M l l I I I I ! I II M I I I I I M M I M I M I M I I M M I M I M I 

72530 TTATGAGCTGTTTAC^ 72471 
1790 AAGCACTTAATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATT 1849 
1790 YY | | | | | ill I I | I I I | I I I I I I I I | | I M I I I I M I I II I I I I I I M M I I I I M II I I 
72470 AAGCACTTAATTTTTTA^ 72411 
1850 CACACAACACTTAGGCTTAAAAATGAGCTCACTCAGAATTTCTATT^ 1909 

i i i i I I l I II I I I I II I I I I I I I I M I I M M I I I I I I I I N I I I I I I I M I I I N I I 

72410 CACACAACACTTAGGCTTAAAAA 72351 



1910 ATTTATTTTTAAATCAATGGGACTCTGATATAAAG^ ^69 

,, | | M | | | | | || I I | II I I I I | l| | | MM I I I I I I I I I II II 11 1 1 

1111111 'ii'Hilllllliiii^i^^^r.r.nArzATVTAAGTCACTGTAAAACAGA 72291 

72350 



l I I I I M I M II I I I I II I I I I M I I I I I I I I I II I I M I M II I M M I II I I M M M 
AT T T ATTT T T AAAT C AAT GGGA.CT CT GAT ATAAAG GAAGAAT AAGT CACT GT AAAAC AGA 



1970 ACTTTTAAATGAAGCTTAAATTACTCAATTTAAAATTTTAAAATC^ 2029 

i i i i i i I I I M I I I I I I I I I I I I I I I I I I I M M M I I I I I M I I I I M I M I I M I II 

1111111111 11 I LLiIlll:iiii^^^A B ^rpTAAAATr.CTTTAAAACAACTTT 72231 

72290 



I I | | I I I || M I I I II I M M M I I I I I I M M III I I I I I I M I I I I M I M I I I I M I 

acttttZtgaag^ 



2030 TCAATTAATATTATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAG^ 2089 
72230 TCAATTAATATTATCACACTATTA^ 72171 

72111 



2090 AGTTGTTGCATTTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAG 2149 

M M I I I M I I I I I I M I I I I I I I I I 4iiiiiiiiiiiiiiUi),'Mi!iirir,AAAG 
72170 ~" " 



I I I I 1 I I II I I I I M M I I I I I I I I I I M M M I M II I I I M I I I I I I M I I II M IN 

AGTTGTTGCATTTTTCGGACACTGGAAACA 



2150 AGCAAGGCTGTTTTTGAAAATCATTACACTTTC ^209 

i i i I I I I I I I I i i I I I i I | | | | I I M I I I I M M I I I I N I I I I I M M I I M I i 

72110 AGCAAGGCTGTTTTTGAAAATCATTACACTTTC^ 



72051 



2210 



71991 



CAATATGTAACCAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACAT^ 2269 

72050 CAATATGTAACCAACATGTCAC 
2270 TGAATTTAAAATATAATACTTTTAAAAAGAAAATTATTACATCCTT^ 2329 

IIIIIIIIIIIMIIIIIIIIIIIIMIIIMIIIIMIIMIIIMIIIMIIIMMI 

71990 TGAATTTAAAATATAATACTTTTAAA 71931 

?330 ATCAAACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAA 2389 
2330 ^CAAACCTCALAAA , , , , , , , , , , , , , , , , , , , , , | | | | | | | | 

71930 ATcZcCTCA^ 71871 
0^0 TCTGTCATTCACATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAA 2449 
2390 ^TGTCATTCACATACC , , , , , , , , , , , , | , , , | | | | M I I I II I I I 

71870 TCTGTCATTCACATACCCTGTGAA 71811 



2450 ATCTTCTTTTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTT 2509 
2450 ATCTTC1 | i i | I I I | | | | | | | | | | | | | | | | I I I I I I I I I I I I I I I I I I I I I I I H I I I I 
71810 ATCTTCTTCTTTCACTATCGTAG^ 



2510 ACCTACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTAC 
2510 ACCTACA ,,,,,,,,, ,,,,,,,,,,,,,,,,,, | Ml I I Ml I I IN mi I IN ^ 



71751 
2569 



ACCTACATACACT 

SAT GGAGAGAT G C C AGT GAC CT C AT AAT AAAGACT GT GAACT GC CT GGT GC AGT GT C C AC 2629 

I M M I I I M I I I M I M I I M I I I I I M M I I I I I I I I I Ml IN IN ' ' ' ' i^ic 71631 



71750 
2570 

71690 GAT GGAGAGAT GCCAGTGACCT CAT A^ 
2630 ATGACAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGC 2689 

i i i i i i i i I I l M I I I I I I I I M I I I I I I M I I I I I I I I I I II I I I I I I II I I M I I I I I 

71630 ATGACAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGT 



71571 



2690 



ATATGTATAATGCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACA^ 2749 
71570 ATATGTAT^^ 71511 

2750 ^ < ^ : "^T^^ ^ T TT^r^^^ T T^ c i' , T^r^ r T T T T T T'^T 'T'T T ^^^^T^^T^^^ ^^^'T'^'T^T^T^ 2 7 8 ;/ 5i 

71510 ACAGCTACCTGTAAAGCTTATTACTAATTTTTGTATT 71451 
2810 AAGTTTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGT 2869 

YVTT 7 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 m 1 1 1 ii 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ii 1 1 1 

71450 AAGTTTGCTTGACATC 

VACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTC1 
i i i i I I I I M I I I I I I I M I I I I M I I I M I I M I I I M M I I M I I 

713 90 a^gaIcctcttagctttgtgcgttcctgcctaatttttatatcttc^ 

2930 TAGGATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGA^ 2989 

I I INI IN Mill I III II Ml II Ml Ml MM Mill M I I M M I II I M I I I I I 

71 330 T AGGAT AGCTT GGGAT GAGAT GT GTGT GAAAGTAT GTACAAGAGAAAACGGAAGAGAGAG 71271 



| | | | I M I M M I I M I I I iiiii^i^^^^^^^^rTTTTTGAGACCGT 71391 

2929 



2870 AAGAACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCT 



71211 



2990 GAAATGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCT 3049 
71270 GZiGiGGiGGGGUGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTT 
3050 CGTCATTGCCTCGTCACATCAATGCAAAAGGTCCT 3109 

i i i I l l l I I I I I I I I I I I I I I I I I I I M M I I I I M I I I I I M I II I I I I I I I I I I I I I 

71210 CGTCATTGCCTCGTCA^ 71151 
3110 CAATGTTCTCAGAGTGACTTTCGAAATAAATTGGG^ 3169 

I l I I I I I I I I I I I M I I I I I I I I M I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I IN 

CAATGTTCTCAGAGTGACTTTCGAAATAAATTGGGCCC 

AATATGCCCAAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATA 3229 

i i i I l l l I I I I I I I I I I I I I M I I I I I I I I M I I I I I I I I I I N I I I I I I I I I I I I I I I I 

AATATGCCCAAATTTTTACTTT^ 71031 



71150 
3170 
71090 



70971 



^30 AC CT AGT AAT GT T GTTTT CT GT C AAT AT T GAAT GT GAT GGT AC AGT AAAC CAAAAC C CAA 3289 

3230 ^^^^ ||M ,,,,,, | ,,,,,, MiMMMiiiiiMIIIIIIIIIM 

71030 AGCTAGTAATGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAA 
3290 CAAT GT GGCCAGAAAGAAAGAGCAAT AATAATT AATTCACACACCAT AT GGATT CT ATTT 3349 

MIIIIIIIIIIIIIIMIMMMII IIIIIIIIIIIIIIMIIIIMIIIIIIIIMI 

70970 CAATGTGGCCAGAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTT 70911 
3350 ATAAATCACCCACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTT 3409 

, 7 ,,, , I I I | | | | | | | | | | | 1 I I I I I I I I I I I I I I I 1 I 1 I I I I I I I I I I I I I I M I I I I I 

70910 ATAAATCACCCACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTT 70851 



3410 ATCATAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAA 3469 

I I I I I I I I I I I I I II I I I I I I I M I II I I I I I M I I I I I I I I I I M I M I I I I I I I I I I I 

70850 ATCATAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCA 



70791 
3529 



3470 CAGTTTATTAATATATTTAATTTCTATTTAAATTTTAGATTATTT^ 

, , , , , I M I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 

70731 



I I M I I I I I I I I I I I M I I M I I I I I M I I I I I I I I M I I I I I I I I I I I I I M M HI 
70790 cagtttIttaatatatttaatttctatttaaattttagattatttttat 



3589 



gaatttttacatcctgataccctttccttctccatgtcagtatcatgttctctaattatc 

70671 



3530 GAATTTT | | | | | | | | | | | | I I I I I M I I I I I I I I I I I I I I I ' ' ' ' 

70730 GAATTTTTACATCCTGATACCTTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATC 



3590 TTGCCAAATTTTGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCA 3649 
3590 T ^CCAAAT | | | | | | | | | | | | | | | | | | | | | | | I I I I I I I I I I M I II I I I I I I I I I I I 
70670 TTGCCAAATTTTGAAACTACACACAAAA 70611 

3650 TTCAGTGGCTTTTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTA 3709 

I I I I I I I I I I I I I II I II | I I | | | | | | | | I I I II I I I I I I I I I N I 1 11 

70610 TTCAGTGGCTTTTT-AAA^ 70552 



3710 ACAATTATAATTTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACT 3769 
3710 ACAATTATAAT ,,,,, ,,,,,,,,,,,,, , ,| M , I I II I I I I I I I I I I IN 

70551 ACAATTATAATTTCTTTA^ 7049 



3770 TCAAAACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAA 3829 

3770 TCAAAACATGT ,,,,,,,,,,,,,,,,,,, | , | | | | | | I I I I 

70491 TCAAAACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAA 



3830 



CATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTAT 



70432 
3889 



Db 


70431 


Ml Ml IIIIIIMMIIMMIM MillllllMMIII 

CATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTA1 


70372 


Qy 


3890 


CCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCC 

|M | | | | | 1 M 1 II 1 1 1 1 I I I 1 1 1 1 1 I M 1 1 1 1 1 1 M 1 1 1 1 Ml 

CCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCC 


Z? T Zf 


Db 


70371 


70312 


Qy 


3950 


AAAGGAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAA 

MM IIIIIIMIMM MM M II 1 II 1 M 1 M II Ml 

AAAGGAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAA 


4009 


Db 


70311 


70? 


Qy 


4010 


TATAACAATGTAAAAAATTATATAT CTGGGAGGATTTTTTGGTT GCCTAAAGTGGCTATA 

! | | | | | | | || | | II 1 II M II M 1 1 M 1 II M II 1 1 II M 1 1 1 M II II II II M 

TATAACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATA 




Db 


70251 




Qy 


4070 


GTTACTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTA 

1 1 | | | | | | || || | | | || II II II II 1 1 1 II II 1 1 1 M M 1 1 II M 1 II II II II 1 II M 1 

GTTACTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTA 


*i ± £. " 


Db 


70191 




Qy 


4130 


CCTTATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAA 

1 M 1 II 1 1 .1 II 1 M 1 M II 1 1 M 1 IMM Ml 1 III 1 IMM 1 1 ' 1 JJJJ^ 

C CT T AT T T T T C ACT GT AC AGAC ACT AAT T CAT T AAAT ACT AAT T GAT T GT T T AAAAGAAA 


4 1 o y 


Db 


70131 


70072 


Qy 


4190 


T ATAAAT GT GACAAGT GGACATTATTT AT GTTAAAT ATACAATT AT CAAGCAAGTAT GAA 

( I i I 1 t 1 1 1 1 i 1 1 M 1 ! 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 I M ' ' ' ' ' i i 

TAT AAAT GT GACAAGT GGAC AT TAT T TAT GT T AAAT AT AC AAT TAT CAAGCAAGTAT GAA 


4249 


Db 


70071 


70012 


Qy 


4250 


GTTATTCAATTAAAATGCCACATTTCTGGTCTCTGGG 4286 

| M | I I II 1 1 i II 1 1 1 II 1 M 1 1 1 1 1 M M 1 1 i 1 1 M 

GTTATTCAATTAAAATGCCACATTTCTGGTCTCTGGG 69975 




Db 


70011 





RESULT 9
AC144750/c
LOCUS AC144750 201093 bp DNA linear HTG 04-JUN-2003

DEFINITION Pan troglodytes clone CH251-517B22, WORKING DRAFT SEQUENCE, 3 

ordered pieces . 
ACCESSION AC144750 

VERSION AC144750.2 GI:31376422

KEYWORDS HTG; HTGS_PHASE2; HTGS_DRAFT.
SOURCE Pan troglodytes (chimpanzee)

ORGANISM Pan troglodytes

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Pan.
REFERENCE 1 (bases 1 to 201093)

AUTHORS Antonellis,A., Ayele,K., Beckstrom-Sternberg,S.M., Bennamm,B.,
Blakesley,R.W., Bouffard,G.G., Brinkley,C., Brooks,S., Canaga,K.,
Chu,G., Coleman,B., Coleman,H., Engle,J., Granite,S., Guan,X.,
Gupta,J., Haghighi,P., Han,J., Hansen,N., Ho,S.-L., Hu,P.,
Hurle,B., Idol,J.R., Karlins,E., Kwong,P., Laric,P., Lee-Lin,S.Q.,
Legaspi,R., Maduro,Q.L., Maduro,V.B., Margulies,E.H., Masiello,C.,
Maskeri,B., McDowell,J., Paguirigan,C., Pearson,R., Portnoy,M.E.,
Prasad,A., Reddix-Dugue,N., Schandler,K., Schueler,M.G., Snan,K.,
Sison,C., Stantripop,S., Thomas,J.W., Thomas,P.J., Tsipouri,V.,
Vogt,J.L., Wetherby,K.D., Wiggins,L., Young,A. and Green,E.D.
TITLE NISC Comparative Sequencing Initiative 

JOURNAL Unpublished 



REFERENCE 2 (bases 1 to 201093)
AUTHORS Green,E.D.
TITLE Direct Submission
JOURNAL Submitted (15-MAY-2003) NIH Intramural Sequencing Center, 8717
Grovemont Circle, Gaithersburg, MD 20877, USA
REFERENCE 3 (bases 1 to 201093)
AUTHORS Green,E.D.
TITLE Direct Submission
JOURNAL Submitted (04-JUN-2003) NIH Intramural Sequencing Center, 8717
Grovemont Circle, Gaithersburg, MD 20877, USA
COMMENT On Jun 4, 2003 this sequence version replaced gi:30725907.
Genome Center 

Center: NIH Intramural Sequencing Center 

Center code: NISC 

Web site: http://www.nisc.nih.gov 
Contact: nisc_zoo@nhgri.nih.gov 

Project Information 

Center project name: esg 
Center clone name: 517B22 

The sequence data in this record represents an 'enhanced'
version of a Phase 2 submission. Specifically, the indicated 
order and orientation of each sequence contig has been 
established using one or more of the following: read-pair 
data from individual subclones, overlaps with neighboring 
clones, alignment with available reference sequence (e.g., 
human), and/or confirmation by PCR testing. In addition, 
the sequence assembly is based on at least 8X average 
coverage in Q20 bases and has been reviewed to rule out 
gross misassemblies, the low-quality ends of sequence 
contigs have been trimmed away, and each base is associated 
with a Phrap-derived quality score. 

Summary Statistics 

Sequencing vector: plasmid; n/a; 100% of reads 

Chemistry: Dye-terminator Big Dye; 100% of reads 

Assembly program: Phrap; version 0.990319 

Consensus quality: 200649 bases at least Q40 

Consensus quality: 200775 bases at least Q30 

Consensus quality: 200836 bases at least Q20 

Insert size: 165000; agarose-fp 

Insert size: 200893; sum-of-contigs 

Quality coverage: 13.47x in Q20 bases; agarose-fp 

Quality coverage: 11.07x in Q20 bases; sum-of-contigs 



NOTE: This is a 'working draft' sequence. It currently

consists of 3 contigs. Gaps between the contigs 

are represented as runs of N. The order of the pieces 

is believed to be correct as given, however the sizes 

of the gaps between them are based on estimates that have 

provided by the submittor. 

This sequence will be replaced 

by the finished sequence as soon as it is available and 

the accession number will be preserved. 

     1 107546: contig of 107546 bp in length
107547 107646: gap of unknown length
107647 153000: contig of 45354 bp in length
153001 153100: gap of unknown length
153101 201093: contig of 47993 bp in length.
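The contig and gap coordinates listed above can be cross-checked against the record length and the "sum-of-contigs" insert size reported under Summary Statistics. The following is only a minimal sketch: the variable names are illustrative, and the assumption that coordinates are 1-based and inclusive is inferred from the listed lengths, not stated in the record.

```python
# Contig/gap layout copied from the COMMENT block of AC144750.
# Assumption: coordinates are 1-based and inclusive.
segments = [
    ("contig", 1, 107546),
    ("gap", 107547, 107646),
    ("contig", 107647, 153000),
    ("gap", 153001, 153100),
    ("contig", 153101, 201093),
]


def span(start, end):
    """Length of an inclusive coordinate range."""
    return end - start + 1


contig_total = sum(span(s, e) for kind, s, e in segments if kind == "contig")
gap_total = sum(span(s, e) for kind, s, e in segments if kind == "gap")

print("sum of contigs:", contig_total)
print("placeholder gap bases:", gap_total)
print("record length:", contig_total + gap_total)
```

Under these assumptions the three contigs add up to the 200893 bp reported as the sum-of-contigs insert size, and the two 100-base runs of N account for the remainder of the 201093 bp record.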
FEATURES             Location/Qualifiers
     source          1..201093
                     /organism="Pan troglodytes"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9598"
                     /clone="CH251-517B22"
                     /clone_lib="CH251"
     misc_feature    1..107546
                     /note="assembly_fragment
                     clone_end:T7
                     vector_side:left"
     misc_feature    83861..201093
                     /note="clone overlaps with GenBank Accession Number
                     AC144499 clone RP43-108D19 (center project name esf)"
     misc_feature    107647..153000
                     /note="assembly_fragment"
     misc_feature    153101..201093
                     /note="assembly_fragment
                     clone_end:SP6
                     vector_side:right"

ORIGIN 



Query Match 64.9%; Score 2792.4; DB 2; Length 201093;
Best Local Similarity 99.2%; Pred. No. 0;
Matches 2838; Conservative 0; Mismatches 16; Indels 6; Gaps 3;
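The summary percentages appear to follow directly from the counts printed beside them. The sketch below reproduces them under two assumptions inferred from the numbers in this listing, not taken from the search program's documentation: Query Match is the score divided by the perfect score (4301, from the search header), and Best Local Similarity is matches divided by the sum of matches, mismatches and indels. The function names are illustrative.

```python
PERFECT_SCORE = 4301  # "Perfect score" reported in the search header


def query_match_pct(score, perfect=PERFECT_SCORE):
    """Score as a percentage of the perfect (self-match) score."""
    return round(100.0 * score / perfect, 1)


def local_similarity_pct(matches, mismatches, indels):
    """Matches as a percentage of all aligned columns (assumed definition)."""
    aligned = matches + mismatches + indels
    return round(100.0 * matches / aligned, 1)


# Result 9 (AC144750): Score 2792.4; Matches 2838; Mismatches 16; Indels 6
print(query_match_pct(2792.4), local_similarity_pct(2838, 16, 6))

# Preceding result (Length 183337): Score 2841.8; Matches 2854; Mismatches 2; Indels 1
print(query_match_pct(2841.8), local_similarity_pct(2854, 2, 1))
```

With these definitions the computed values agree with the 64.9% / 99.2% and 66.1% / 99.9% figures printed for the two results.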


Qy 


1430 


AGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGC 

| | | M 1 1 II 1 1 1 1 1 M 1 1 M 1 1 1 II 1 1 1 1 1 1 1 M 1 1 1 M 1 1 1 M 1 1 M 1 1 1 M 1 1 II 1 1 1 
AGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGC 


1489 


Db 


69717 


69658 


Qy 


1490 


AGT C GT GCT T AAAGTT CAAAGCTAAT GAT C AC GGAT AT GACAACT T C C GT T C C AGT AAT A 

1 1 M 1 1 1 1 1 1 1 M 1 M M Ml 1 1 1 1 1 M 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 M 1 1 

AGT C GT GCT TAAAGT T CAAAGCTAAT GAT C AC GGAT AT GACAACTT C C GT T C CAGTAAT A 


1549 


Db 


69657 


69598 


Qy 


1550 


AAT AC AGCT CAT CT T GAAAGAAGAACT AT T CACT GT AT T T CAT TT T CT T TAT AT T GGAC C 

I I 1 I I I I I I I 1 1 1 1 1 1 1 1 1 1 t 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 M 1 1 1 1 1 M M 

AAT AC AGCT CAT CT T GAAAGAAGAACT AT T CACT GT AT CT C AT TT T CT T TAT ATT GGAC C 


1609 


Db 


69597 


69538 


Qy 


1610 


G AAGT CAT T AAAACAAAAT GAAAC AT T T G C C AAAAC AAAAC AAAAAACT AT GT AT TT G CA 

MINI MMIIIIMIIIIIIIIMMIIIIIIIIIMIIIIIIMIIIIIM 

GAAGT CAT T AAAAC AAAAT GAAAC AT T T GC C AAAAC AAAAC AAAAAACT AT GT AT T T GC A 


1669 


Db 


69537 


69478 


Qy 


1670 


C AG C AC AC TAT T AAAAT AT TAAGT GT AAT TAT TT T AACACT C AC AGCT AC AT AT GAC ATT 

IIMIIIIIIIIIIIMIIIMIIIIIMIIIIIMIIIIIII 1 1 1 1 1 1 M 1 1 M 1 1 1 1 

CAGCACACT AT T AAAAT AT TAAGT GT AAT TAT T T T AACACT CAT AGCT AC AT AT GACAT T 


1729 


Db 


69477 


69418 


Qy 


1730 


T TAT GAGCT GT T T AC GG CAT GGAAAGAAAAT C AGT GG GAAT T AAGAAAGC CT C GT C GT GA 

I | | M 1 1 1 1 1 1 1 1 1 1 M II 1 1 1 1 1 1 1 M 1 1 1 1 1 1 M 1 1 1 1 1 1 II II 1 1 MINI 

T TAT GAGC T GT T T AC G GC AT GGAAAGAAAAT CAGT GGGAAT T AAGAAAGC CT CAT C GT GA 


1789 


Db 


69417 


69358 


Qy 


1790 


AAGCACTTAATTTTTTACAGTTAGCACTTCAACAT AGCT CTTAACAACTTCCAGGAT ATT 

I | | || 1 1 1 1 1 1 1 II M 1 1 1 1 M 1 1 1 1 1 M 1 1 1 M II 1 1 1 1 MIMIIIMIIIII 

AAGC ACT T AAT T T T T T AC AGT TAG C ACT T CAACAT AGCT CT T AACAACT T CC AGGAT AT T 


1849 


Db 


69357 


69298 


Qy 


1850 


CACACAACACTTAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAG 

MIIIIMMMIIIIMIIIMIIIIMIIMIIMIlll 1 1 1 1 1 1 1 1 1 1 1 1 M 1 M 1 


1909 



69297 



CACACAACACTTAGGCTTAAAAATGAGCTCACTCAGAATTTTTATTCTTTCTAAAAAGAG 69238 



1910 AT T TAT T T T T AAAT CAAT G GGACT CT GAT AT AAAGGAAGAAT AAGT CACT GT AAAAC AGA 

I I I I I M M I I 1 MMMMMMMMMMMMMMIMMMIMM 

69237 AT TT ATTTTTAAAT CAAT GGGACT CT GAT AT AAAGGAAGAAT AAGT CACT GTAAAACAGA 



1970 
69177 

2030 
69117 



ACT T T T AAAT GAAG CT T AAAT TACT CAAT T T AAAAT T T T AAAAT C CT T T AAAACAACT T T 

I t 1 1 I I I I I I 1 M 1 M 1 t t I I 1 I 1 1 1 I 1 I 1 I I 1 I I I I 1 I I I 1 1 I i t MINI 

AC T T TT AAAT GAAGCT T AAAT TACT CAAT T T AAAAT T T T AAAAT C CT T T AAAACAACT T T 
T CAATTAAT ATTAT CACACT ATTAT CAGATT GTAATT AGAT GCAAAT GAGAGAGCAGTTT 

I I I I I I I M II I I I I I I I M II I! MINIM I I I t I II I Ml 

T CAATTAAT AT TAT CACACT ATT AT CAGACT GTAATT AGAT GCAAAT GAGAGAGCAGTTT 



2090 AGTT GT T GCATTT TT CGGACACT GGAAACATTTAAAT GAT CAGGAGGGAGTAACAGAAAG 

I M | || | | M II M M II I I II I I M M M I II Ml II I II I I I M M M II I I M M I 

69057 AGTT GTTGCATTTTTCGGACACTAGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAG 



1969 

69178 

2029 

69118 

2089 

69058 

2149 

68998 



2150 AGC AAG GCT GTT T T T GAAAAT C AT T AC A CT T T CACT AGAAGC C CAAAC CT CAGC AT T 2206 

I I I I I I I M M M M M IMMI I II I I II I I II I M I M II I M II II I M 

AGCAAGGCTGTTTTTGAAAATCATTACACTCCTTTCACTAGAAGCCCAAACCTCAGCATT 



68997 
2207 

68937 
2267 



CT GCAAT AT GT AAC CAACAT GT C ACAAACAAGCAGCAT GT AAC AGACT GGC AC AT GT GC C 

I | | | | || | || M M I M I M M I I I M M I II II II M M II M 

CT GCAAT AT GTAAC CAACAT GT CACAAACAAGCAGCAT GTAACAGACT GGCACAT GT GCC 



68938 
2266 
68878 
2326 



AG CT GAAT T T AAAAT AT AAT ACT T TTAAAAAGAAAAT T ATT AC AT C CT T T AC AT T CAGT T 
I | | | | | | | | | | | | | | | | | | I I I I I I I I I I I I I M I I II M I M I I I II M I M M 
68877 AGCT GAAT T T AAAAT AT AAT ACT T TT T T T A — AAAAT TAT T AC AT C CT T T ACAT T CAGT T 68820 



2327 AAGAT CAAAC CT CACAAAGAGAAATAGAAT GTTT GAAAGGCT AT CCCAAAAGACTTTTTT 

I | | | | | | | | M | || M I I I I I I M II I M II II I I M I I I M M I M II II I I M Ml 

68819 AAGAT CAAACCT CACAACGAGAAATAGAAT GTTT GAAAGGCT AT CCCAAAAGACTT CTTT 



2386 
68760 
2446 



23 87 GAAT CT GT CAT T CAC AT AC C CT GT GAAGACAAT ACT AT CT ACAAT TT T T T C AGGAT TAT T 

I I | I I I I I I I I I I I I I t I i I I I I I I M I I I I I I I I I I I I I I t I I 1 I I I 1 I 1 I M M I I I I 

GAAT CT GT CAT T CAC AT AC CCT GT GAAGACAAT ACT AT CT AC AAT T T T T T CAG GAT T ATT 68700 



68759 

2447 
68699 

2507 
68639 

2567 
68579 

2627 



AAAAT CTT CTTTTTT CACT AT CGTAGCTTAAACTCT GTTT GGTTTTGT CAT CTGT AAAT A 

I I I I I I I I I I I M M M II M II M II II M M I I II M M II II II I M M M I II M 

AAAAT CTTCTTCTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATA 
CTTACCTACATACACTGCAT GTAGAT GAT T AAAT GAGGGCAGGCCCT GTGCTCATAGCTT 

I I I I | | | | I I || I I II I II II M II I I M M M II M MMMMMMMMM 

CT T ACCT ACAT ACACT GCAT GTAGAT GAT T AAAT GAGGGCAGGC CCT GTGCT CAT AGCT T 
T AC GAT GGAGAGAT GC CAGT GAC CT C AT AAT AAAGACT GT GAACT G C CT GGT G CAGT GT C 

I | | | | | | | || I I I I I II I II I II M II II I I M M I M II M II I M II II M I I M I M 

T AC GAT G GAGAGAT G C CAGT GAC CT CAT AAT AAAGACT GT GAACT GC CT GGT GCAGT GT C 



CACATGACAAAGGGGCAGGTAGCACCCTCTCTCACCC AT GCT GTGGTT AAAAT GGT TTCT 

I I I I I I | M I M II M I I M I M II II M M I II I I II M M I II II I M 

68519 CACATGACAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCT 



2687 AGCATATGTATAAT GCT ATAGTTAAAATACTATTTTT CAAAAT CATACAGATTAGTACAT 

MIM IMMI M I I M t I I I I I I I I M I I I I I I I i 1 I I I I I I I I I 

68459 AG CAT AT GT AT AAT GCT AT AGT T AAAAT ACT AT T T T T CAAAAT C AT ACAGAT T AGT AC AT 



2506 

68640 

2566 

68580 

2626 

68520 

2686 

68460 

2746 

68400 



2747 TTAACAGCTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATA 2806 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 N 1 1 1 1 1 1 1 1 1 1 1 1 1 1 liiiiiiiiiliiiiiiililili.Ull! 

68399 
2807 



TTAACAGCTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATA 68340 
GAAAAGTTTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGAC 2866 



I I I I I I | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
68339 GAAAAGTTTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGAC 68280 

2867 CGTAAGAACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTG 2926 
I I I I | | | I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I 
68279 CGTAAGAACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTG 68220 

2927 CCTTAGGATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGA 2986 
I | MM || | I I Ml I I I I II I II I I II I I I I I I I Ml I I I I I MM I III II I I I I I I I I 
68219 CCTTAGGATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGA 68160 

2987 GAGGAAATGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAAC 3046 

I I I I I I I I M M M I I M I I I I II I I I I II I I I I II I I I I I I M I II I I I I I I I I I I I I I 
GAGGAAATGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAAC 68100 



68159 
3047 

68099 
3107 

68039 
3167 



GTTCGTCATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACA 

IMMMMM | M I I I II I I I M II I I I II I I I I I I I II M II I I I II M I I I I 

GTTCGTCATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACA 

GTGCAATGTTCTCAGAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCT 
M ! I I I I I I I I I I I M I I I II I I M II I I I II I I II I I I I I I I II I I M I I I I I I I I I I I 

GTGCAATGTTCTCAGAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTGGGTCT 



TAAAATATGCCCAAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAA 

| | | | M I II I I I M II I I I I I II I I I I I I I I I I I I I I I II I M I I I I I Ml MM 

67979 TAAAATATGCCCAAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAA 



3227 ATAAGCTAGTAATGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACC 

M I I I I M | M II I I I M I II II II I II I M I II I I M I II I I I II I I I I II I II I I II I 

67919 ATAAGCTAGTAATGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACC 
3287 CAACAATGTGGCCAGAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTA 

M || | | | M I M I II I I M Ml I M I I II M I I M I I I M I I I I I M I MM II M II M 

C AACAAT GT GGCCAGAAAGAAAGAGCAAT AAT AATT AAT T CACACAC CAT AT GGAT T CT A 
TTTATAAATCACCCACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCT 

IMM I I M I I I I I I II I I I I I I I 1 I I I I II I I I I M IMMMMM 

TTTATAAATCACCCACAAACTTGTTTTTTAATTTCATCCCAATCACTTTTTCAGAGGCCT 
GTTATCATAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTT 

I I I I I I M I M I I M I I I I I I I II I II II I I I II II M I I I II II M II I I I I M I M I 

GTTA.TCATAGAAGACATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTT 
TCACAGTTTATTAATATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGT 



67859 

3347 
67799 

3407 
67739 

3467 
67679 

3527 



3106 

68040 

3166 

67980 

3226 

67920 

3286 

67860 

3346 

67800 

3406 

67740 

3466 

67680 

3526 



I M I II I M I II II M II I II I M M I I II M I II II II I II I I II II I II M I I II i . 

TCACAGTTTATTAATATATTTTATTTCTATTTAAATTTTAGATTATTTTTATTACCATGT 67620 



ACTGAATTTTTACATCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATT 

I | || | || | | | | | || I I M II I I I II II I II II M I I M M I I I I I M I I M M M 

67619 ACTG 



3586 



AAT T T T T AC AT C C T GAT AC CCTTTCCTTCTC CAT GT C AGT AT CAT GTT CT CT AAT T 67560 



Qy 3587 
Db 67559 



ATCTTGCCAAATTTTGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATT 

I | | | | | | | | | M M II I I I I I I I I I M I M I I I I I I I I I I I M 

AT CT T GC C AAAT T T T GAAACT AC ACACAAAAAGC AT ACT T G CAT TAT T T AT AAT AAAAT T 



3646 
67500 
3706 



Qy 3647 G CAT T C AGT GGC T T T T T AAAAAAAAT GT T T GAT T CAAAACT T T AAC AT ACT GAT AAGT AA

M MM I II IN M M II I M I II I I M I I I M II I I I M I I II I I II I I M I I 

Db 67499 GCATTC AGT GGCTTTTT-AAAAAAATGTTTGATTCAAAACTTTAACAT ACT GAT AAGT AA 67441 



Qy 3707 GAAAC AAT TAT AAT T T C T T T AC AT ACT C AAAAC C AAGAT AGAAAAAGGT G CT AT C GT T C A

|| || || || I M II I I I I I M II I M I I I I M M I I 

GAAACAATTATAATTT CTTTACATACT CAAAACCAAGATAGAAAAAGGTGCTATCGTT CA 



Db 67440 
Qy 3767 



ACT T C AAAAC AT GT T T C CT AGT AT T AAG GACT T T AAT AT AG C AAC AGAC AAAAT TAT T GT 

I | || || M I I I I I II II I II I I M M I M II I N I II I II M I I I I I I II M M I II II I 

Db 67380 ACTTCAAAACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGT 



3766 
67381 
3826 
67321 
3886 



Qy 3827 TAACATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTAT

M M M I II I I IMIM Ml M II I I II I I I I II I I I I I I I 

Db 67320 TAACATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTAT 67261 



Qy 3887 TATCCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATG

Y | m II II II I I M I I II II I M II I II I I I I I M M II II I II II I II I II I N 

Db 67260 TATCCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATG

Qy 3947 GCCAAAGGAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTG

I M II II II M I I I I I II I I M I II I M I I I I I II I I I I I I M I I I II II I II II I I II 

Db 67200 GCCAAAGGAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAACTG

Qy 4007 TAATATAACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCT

Y | | || M II I I M I II I II II I I II II II I II II I II M II M I I II I M 

Db 67140 TAATATAACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCT 

Qy 4067 ATAGTTACTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAA

I I I M II M I I II I M I II I I I M I I I II II I M I I I M I I I I I I M II M I II II I I M 

Db 67080 ATAGTTACTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAA

Qy 4127 CTACCTTATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAG

| | M II II M I II II M M I I I I I I I M II II II M I I M I I II I I I II II I I II 

Db 67020 CTACCTTATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAG

Qy 4187 AAATATAAATGTGACAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTAT

| | | M I II M M I I M I I I I I I I M M I I M M II I I I I I I i I I N M I M I I M 

Db 66960 AAATATAAATGTGACAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTAT

Qy 4247 GAAGTTATTCAATTAAAATGCCACATTTCTGGTCTCTGGG 4286

| I I II I II II I II I II II I I I I I I I I M I I I M I I 

Db 66900 GAAGTTATTCAATTAAAATGCCACATTTCTGGTCTCTGGG 66861



3946 

67201 

4006 

67141 

4066 

67081 

4126 

67021 

4186 

66961 

4246 

66901 



RESULT 10 

G06417 

LOCUS       G06417 2720 bp DNA linear STS 19-OCT-1995
DEFINITION  human STS WI-7149, sequence tagged site.
ACCESSION   G06417
VERSION     G06417.1 GI:859662
KEYWORDS    STS; STS sequence; primer; sequence tagged site
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 2720)
  AUTHORS   Hudson,T.
  TITLE     Whitehead Institute/MIT Center for Genome Research; Physically
            Mapped ESTs
  JOURNAL   Unpublished (1995)
COMMENT     Contact: Thomas Hudson
            Whitehead Institute/MIT Center for Genome Research
            Whitehead Institute for Biomedical Research
            9 Cambridge Center, Cambridge MA 02142 USA
            Tel: 617 252 1900
            Fax: 617 252 1902
            Email: thudson@genome.wi.mit.edu
            Primer A: ATGGAGAGATGCCAGTGACC
            Primer B: TAGGCAGGAACGCACAAAG
            STS size: 331
            PCR Profile:
            Presoak:
            Denaturation:
            Annealing: 56 degrees C
            Polymerization:
            PCR Cycles: 35
            Thermal Cycler:
            Protocol:
            Template: 10 ng
            Primer: each 5 pM
            dNTPs: each 4 nM
            Taq Polymerase: 0.025 units/ul
            Total Vol: 20 ul
            Buffer:
            MgCl2: 1.5 mM
            KCl: 50 mM
            Tris-HCL: 10 mM
            pH: 9.3
            Prepared with primer pairs derived from D90402 - Unigene.
FEATURES             Location/Qualifiers
     source          1..2720
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
                     /map="710_D_4; 788_D_1; 795_F_4; 921_F_2; 940_GJ
                     969_D_1"
     STS             1005..1335
     primer_bind     1005..1024
     primer_bind     complement(1317..1335)
ORIGIN



Query Match 60.7%; Score 2610; DB 11; Length 2720;
Best Local Similarity 96.0%; Pred. No. 0;
Matches 2610; Conservative 0; Mismatches 110; Indels 0; Gaps



1567 AAGAAG AACT AT T CAC T GT AT T T CAT T T T CT T TAT AT T GGAC C GAAGT CAT T AAAAC AAA 162 6 

| I i I I I I I I I I I II I I I I I I I I I I I I I I I I I I M 1 I I I I I I I I I I I I I M I I 
1 AAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTNNNNNNNN 60 

1627 AT GAAAC AT T T G C CAAAACAAAAC AAAAAACT AT GT AT T T GC ACAGC AC ACT ATT AAAAT 168 6 

I I I I II I 1 I I I I II I I I I I I I I I I I M I I 
61 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTATGTATTTGCACAGCACACTATTAAAAT 120 

1687 ATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGTTTACGG 174 6 

|| | | || | || | | | I I I M I I I I I I I I I I I M I M I I I I I I I I I M I I I I I I i I I I I 

121 AT T AAGT GT AAT TAT T T T AAC ACT C AC AGCT AC AT AT GAC AT T T TAT GAGCT GT T T AC G G 180 

1747 CATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAATTTTTTA 1806 

| 1 | | | | | | | | | I I I || I II I I I I I I I I I I I I I I II I I I I I I I I M I II II I I I I I I M M 
181 CATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAATTTTTTA 240 

18 07 CAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACTTAGGCT 1866 

| M | I I I I I I I I I I I I M I M I I M I I I II I I I I I I I I M I I I I I I I I I I I I I I I I I I I I 
241 CAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACTTAGGCT 300 

18 67 TAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTAAATCAA 192 6 

M I I II I M I I I I I M I I I I M M I I M I I I M I I I I I I M M II 

301 TAAAAAT GAGCT C ACT C AGAAT T T CT AT T CTT T CT AAAAAGAGAT T TAT T T T TAAAT C AA 360 

1927 T G G GACT CT GAT AT AAAG GAAGAAT AAGT C ACT GT AAAAC AGAACT T T TAAAT GAAG CT T 1986 
| | | I M I M I I I I I I II I I I I I I I I I I I I M I I M I I M M I I I I I II I I II I I I I M I I 
361 TGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATGAAGCTT 420 

1987 AAAT TACT CAAT T T AAAAT T TT AAAAT CCT T T AAAACAAC T TT T CAAT T AAT AT TAT CAC 2046 
| | | | M I I M I I I I I I II I I I I I I I I I I I M I I I II I I I M I I I I I I I II I I I I I I I M I 
421 AAAT TACT CAATT T AAAAT T TTAAAAT C CT T T AAAACAAC T T T T CAAT T AAT AT TAT CAC 48 0 

2047 ACT AT TAT C AGAT T GT AAT T AGAT GCAAAT GAGAGAGC AGT TT AGT T GT T GCATTT T T C G 2106 
| | | | M I M I I I M I I I I I I I I I I I I M I II I I I II I I I II I I I I I I I I I I I I M I I I M 
481 ACT AT TAT C AGAT T GT AAT T AGAT G CAAAT GAGAGAG C AGT T T AGT T GT T GCAT TT T T C G 54 0 

2107 GAC ACT G GAAAC AT T TAAAT GAT C AGGAGGGAGT AAC AGAAAGAGCAAGGCT GT TT T T GA 2166 
| | | | | | | M I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I M I I I M I I M I I I M I 
541 GACACT G GAAAC AT T TAAAT GAT CAGGAG GG AGT AAC AGAAAGAGCAAGGCT GTTT T T GA 60 0 

2167 AAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAACCAACAT 2226 

|| I I II I I I I I I I I I M | I M I I M I I M I I I I M I I I I I I I II I I I I I I I M I 

601 AAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAACCAACAT 660 

2227 GT CACAAACAAGCAGCAT GT AACAGACT GGCACAT GT GC CAGCT GAATTT AAAAT AT AAT 2286 
| | | I I I I I I I I I M I I I I I I M I I I I I I I M I I I II I I I I I I I I I M I II I I I M I I I I I 
661 GT CACAAACAAGCAGCAT GT AAC AGACT GGCACAT GT GC CAGCT GAATTT AAAAT AT AAT 72 0 

2287 ACT T T T AAAAAGAAAAT T AT T AC AT C CT T T AC ATT C AGT T AAGAT CAAAC C T CACAAAGA 234 6 
| | | M | | | || | I I I I I I I I I I II I I I II I M M M I I I I II I I I I I I I I I I I I I M I I I I 
721 ACT T T T AAAAAGAAAAT T AT T AC AT C C T T T AC AT T CAGT T AAGAT CAAAC C T CACAAAGA 78 0 

2347 GAAAT AGAAT GTTT GAAAG G CT AT C C CAAAAGACT T T T T T GAAT CT GT CAT T C ACAT AC C 24 06 
| | | M | | | | | M | M I I I I I I I I I M I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I 
7 81 GAAAT AGAAT GTTT GAAAG GCT AT C C C AAAAGACTT T T T T GAAT CT GT CAT T C ACAT AC C 84 0 

2407 CT GT GAAGAC AAT ACT AT CT AC AAT T T T T T C AGGAT T AT T AAAAT CTTCTTTTTT CAC T A 2466 



I I I I I I I I I M M I I M II M I I I 1 I I I I I I II I 1 1 I I I I IMM 

CT GT GAAGACAAT AC TAT CT ACAAT T T T T T C AGGAT TAT T AAAAT CTTCTTTTTT C ACT A 
TCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACACTGCAT 

I I M Ml I I M I I M I I I I I I M I I I I I I I I M_M [[}_[ Hi! 11 1! Hi!, 1 ' ' ' 



901 
2527 

961 
2587 
1021 
2647 
1081 



TCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACACTGCAT 960 

GTAGAT GAT TAAAT GAGGGCAGGC C CT GT GCT CAT AGCTTTACGAT GGAGAGAT GCCAGT 2586 

MIMIIIIIIIMMII IMIIIIIIIIIMIIII 1IIMII 1 inon 

GTAGAT GAT TAAAT GAGGG C AG G C C CT GT GCT C AT AG CT T T AC GAT G GAGAGAT GC C AGT 1020 



GACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGGGCAGGT 2646 

M | I I I I I I | | M I I I I I I I M I I M I I I I I I I I I M I I I I I I I I I I M II II I M I II I 

GAC CT C AT AAT AAAGACT GT GAACT G C CT GGT G CAGT GT C CAC AT GACAAAGG GGCAG GT 108 0 
AGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAATGCTATA 2706

IIIIIIIMIIIMIIIIMMMMIIIIIIIMIIIIIIIMIII MINIM 

AGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAATGCTATA 1140 



2707 GT TAAAAT ACT AT T T T T CAAAAT CAT ACAGAT T AGT ACAT T T AAC AGCT AC CT GT AAAGC 2766 

I | | | | | I I I I I I I I I M II I I II I II I Ml II II MINIMI III I I 

1141 GT TAAAAT ACT AT T T T T CAAAAT CAT ACAGAT T AGT AC AT T T AAC AGCT AC CT GT AAAG C 1200 



2767 
1201 
2827 
1261 
2887 
1321 
2947 
1381 



TTAT TACT AAT T T T T GT AT TAT T TT T GT AAAT AGC CAAT AGAAAAGTTT GCT T GAC AT G G 282 6 

I I I I I | | || II I II N I I I N II II II I N I II I I II I I N I I N II I II II N I 

TTAT TACT AAT T T T T GT AT TAT T T T T GTAAAT AGC CAAT AGAAAAGT T T GCTT GACAT GG 



TGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTTAGCTTT 

I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I 1 I I I ) I I I I I 1 I I I I M I I N I I 

TGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTTAGCTTT 
GTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTTGGGATG 

I I I | M I I II II N N I II N I I I II M I M N N N I I I I I N II II I I N I I M M I I 

GTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTTGGGATG 
AGAT GT GT GT GAAAGT AT GT ACAAGAGAAAAC GGAAGAGAGAGGAAAT GAGGT GG GGT T G 

I | | | | | | | | | || || N I II I I II I II M M M N N N M II I I I I N I 

AGATGT GT GT GAAAGT AT GT ACAAGAGAAAACGGAAGAGAGAGGAAAT GAGGT GGGGTT G 
GAGGAAAC C CAT GG GGAC AGAT T C CCAT T CT T AGCCT AAC GT T C GT CAT T GC CT C GT CAC 

I I ! M | I I | I || N N N I I I II I I II N M II I N I I II I I N I I I II N I M I I I I I I 

GAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCTCGTCAC
ATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCAGAGTGA 

I | | | | | | N || II I I N II I I II I N M N I I II I I I I I I N II I I N I 

AT CAAT GCAAAAGGT C CT GAT T T T GT T C CAGCAAAAC ACAGT GCAAT GTT CT CAGAGT GA 



3127 CTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTT AAAAT AT GCCCAAATTTTT 

I | | | | | | || | | | | || II I I I N II N I I I II M N N I N I I II I II I I II II N I N N 

1561 CTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATAT GCCCAAATTTTT 



1260 
2886 
1320 
2946 
1380 
3006 
1440 
3066 
1500 
3126 
1560 
3186 
1620 



ACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATGTTGTTT 32 4 6 



3]_ q 7 _ 

"~l T I TT III I I I I I I M I I I I I I I I I I I I I MM M I II I M I M_l M M I MJJJ 
1621 " ----- - 



ACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATGTTGTTT 1680 



3247 T CT GT CAAT AT T GAAT GT GAT GGT AC AGT AAAC C AAAAC C C AAC AAT GT GGC C AGAAAGA 3306 

I | | | | | | | | | | | | | | | | I I II I II M II I I I M M I I II II I I M I I I I I I M I I I I I M 



1681 



T C T GT C AAT AT T GAAT GT GAT G GT AC AGT AAAC C AAAAC C C AAC AAT GT GG C C AGAAAGA 174 0 



3307 AAGAGCAATAATAATTAATT CACACACCATAT GGATTCTATTTAT AAATCACCCACAAAC 3366 

| I I , , | | | | | | | | | | | | | | | | | I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I M I II I I 
1741 AAGAGC AAT AAT AAT T AAT T C AC ACAC CAT AT GG ATT CT AT TT AT AAAT CAC C CACAAAC 1800 



3367 
1801 
3427 
1861 
3487 
1921 
3547 
1981 
3607 
2041 
3667 
2101 
3727 



TTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGTCATTTT 3426 

I | | | | | | | M | | | I I I I I I ■ I I I I I I t I I I 1 I MINIUM 

TTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGTCATTTT 1860 
AGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAATATATT 3486 

AGACTCTCAATTTTAA^ 1920 

TAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACATCCTGA 354 6 

| | | I I I II I I II II I I II N M II I I II 
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTACCATGTACTGAATTTTTACATCCTGA 198 0 

TACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTTTGAAAC 3606 
TACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTTTTTAAA 3666 

I I M II I I I II I I I I I I I I I II I I I I M I M I I I I I I I I I M I I I I I I M I 

TACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTTTTTNNN 2100 



I | | | M I I I II II II I I I I I II I I M I I I I II M I M II I I I I M I II II I I I I M II i . 

TACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTTTGAAAC 2040 



AAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAATTTCTTT 

I | m | | | M | | || I I I I II II I I I I I I I I I N I I I I I N I I I I N I I I I I M II I 

NNNNNTGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAATTTCTTT 
ACATACT CAAAACCAAGAT AGAAAAAGGTGCT ATCGTT CAACTT C AAAACAT GTTT CCTA 



3726 
2160 
3786 



MINIMUM I I I I N II I I I I II I M I I I I I I I I I I I I M I I I N I I I I I 

2161 ACATACT CAAAACCAAGAT AGAAAAAGGTGCTATCGTT CAACTT CAAAACAT GTTTC CTA 2220 



3787 GTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTACAGCTC 

| | | | | | | | | | || | | | I II II II I II I I I I I N I I N I M I M I I I M II I I M II I I I II 
2221 GTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTACAGCTC 

3847 AAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAATGTGGAT 

M I I I I I I I I I | | | | I I I I I I I I M M M I II I I I M II I I I I I 

2281 AAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAATGTGGAT 

3907 GTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACAGTTTAT 

|| | | | || M I I I I I I I M II MINIM M I II I I I N II II I I I 

2341 GTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACAGTTTAT 



3846 
2280 
3906 
2340 
3966 
2400 



AGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGTAAAAAA 4 026 

I M | | | M I II I I I I N I M M II I I I II I I M I II I I M I II M I M I M I I II I I I N 

AGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGTAAAAAA 24 60 



4 027 TTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTTTTTATT 4086 



I I II I I I I 



|| | | | || I I I I II I I I I II I I M I I I I II I I I I I I N I I II M I II I 



2461 TTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTTTTTATT 2520 

4087 ATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTCACTGTA 4146 

; | | | | || || | | | | || | | | M II I M I I I I I I II I M M I I I I I I M N I I I N I I M I M 
2521 ATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTCACTGTA 2580 



GAC ACT AAT T CATT AAAT ACT AAT T GAT T GTTTAAAAGAAAT AT AAAT GT GACAAGT G 



Db 2581 CA< 

4207 GACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATTAAAATG 4266 

: z - 

Qy 4267 CCACATTTCTGGTCTCTGGG 4286

Y M M I I I I I M I I M I I I M 

Db 2701 CCACATTTCTGGTCTCTGGG 2720 
AC130785 169751 bp DNA linear HTG 29-AUG-2002

Papio anubis clone RP41-325P5, WORKING DRAFT SEQUENCE . 

AC130785 

AC130785.1 GI:22218455 

HTG; HTGS_PHASE2; HTGS_DRAFT. 

Papio anubis (olive baboon) 

Papio anubis

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Cercopithecidae;
Cercopithecinae; Papio.

1 (bases 1 to 169751)

Akhter,N., Antonellis,A., Ayele,K., Beckstrom-Sternberg,S.M.,
Benjamin,B., Blakesley,R.W., Bouffard,G.G., Breen,K., Brinkley,C.,
Brooks,S., Dietrich,N.L., [illegible] Karlins,E.,
[illegible]
Margulies,E.H., Masiello,C., Maskeri,B., [illegible]
[illegible] Thomas,J.W., Thomas,P.J., [illegible]
Wetherby,K.D., Wiggins,L., Young,A., Zhang,L.-H. and Green,E.D.
NISC Comparative Sequencing Initiative 
Unpublished 

2 (bases 1 to 169751) 
Green, E. D. 

Submitted (14-AUG-2002) NIH Intramural Sequencing Center, 8717
Grovemont Circle, Gaithersburg, MD 20877, USA 

3 (bases 1 to 169751) 
Green, E. D. 

Submitted (29-AUG-2002) NIH Intramural Sequencing Center, 8717
Grovemont Circle, Gaithersburg, MD 20877, USA 

Genome Center 

Center: NIH Intramural Sequencing Center 
Center code: NISC 

Web site: http://www.nisc.nih.gov 
Contact: nisc_zoo@nhgri.nih.gov 

Project Information 

Center project name: deh 
Center clone name: 325P05 



The sequence data in this record represents an 'enhanced'
version of a Phase 2 submission. Specifically, the indicated 
order and orientation of each sequence contig has been 
established using one or more of the following: read-pair 
data from individual subclones, overlaps with neighboring 
clones, alignment with available reference sequence (e.g., 
human), and/or confirmation by PCR testing. In addition, 
the sequence assembly is based on at least 8X average 
coverage in Q20 bases and has been reviewed to rule out 
gross misassemblies, the low-quality ends of sequence 
contigs have been trimmed away, and each base is associated 
with a Phrap-derived quality score. 

Summary Statistics 

Sequencing vector: plasmid; n/a; 100% of reads 
Chemistry: Dye-terminator Big Dye; 100% of reads 
Assembly program: Phrap; version 0.990319 
Consensus quality: 169735 bases at least Q40 
Consensus quality: 169747 bases at least Q30 
Consensus quality: 169749 bases at least Q20 
Insert size: 138000; agarose-fp 
Insert size: 169751; sum-of-contigs 
Quality coverage: 10.99x in Q20 bases; agarose-fp 
Quality coverage: 8.94x in Q20 bases; sum-of-contigs 

* NOTE: This is a 'working draft' sequence. It currently

* consists of 1 contigs. Gaps between the contigs 

* are represented as runs of N. The order of the pieces 

* is believed to be correct as given, however the sizes 

* of the gaps between them are based on estimates that have 

* provided by the submittor. 

* This sequence will be replaced 

* by the finished sequence as soon as it is available and 

* the accession number will be preserved. 

* 1 169751: contig of 169751 bp in length. 
FEATURES Location/Qualifiers 

source 1..169751

/organism="Papio anubis"
/mol_type="genomic DNA"
/sub_species="anubis"
/db_xref="taxon:9555"
/clone="RP41-325P5"
/clone_lib="RP41"

misc_feature 1..169751

/note="assembly_fragment
clone_end:T7
vector_side:left
clone_end:Sp6
vector_side:right"

misc_feature 1..63149

/note="clone overlaps with GenBank Accession Number 
AC129069 clone RP41-240D13 (center project name deg) 

ORIGIN 

Query Match 59.3%; Score 2550; DB 2; Length 169751; 

Best Local Similarity 94.8%; Pred. No. 0; 

Matches 2717; Conservative 0; Mismatches 130; Indels 19; Gaps 
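
The percentages in the summary lines above appear to be simple ratios of the reported counts. The sketch below assumes that Query Match is the score divided by the perfect score (4301, from the report header) and that Best Local Similarity is matches divided by the sum of matches, mismatches, and indels. Both formulas are inferred from the figures printed in this report, not taken from the search program's documentation, and the function names are hypothetical.

```python
def query_match_pct(score, perfect_score):
    """Score as a percentage of the perfect score (assumed definition)."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches, mismatches, indels):
    """Matches as a percentage of aligned columns (assumed definition)."""
    aligned_columns = matches + mismatches + indels
    return round(100.0 * matches / aligned_columns, 1)


# Figures from the AC130785 hit: Score 2550, Matches 2717,
# Mismatches 130, Indels 19; perfect score 4301 from the header.
ac130785_query_match = query_match_pct(2550, 4301)
ac130785_similarity = best_local_similarity_pct(2717, 130, 19)

# Figures from the G06417 hit: Score 2610, Matches 2610,
# Mismatches 110, Indels 0.
g06417_query_match = query_match_pct(2610, 4301)
g06417_similarity = best_local_similarity_pct(2610, 110, 0)
```

Under these assumptions the computed values reproduce the reported 59.3% and 94.8% for AC130785 and 60.7% and 96.0% for G06417.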



1430 AGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGC 1489 
I I I I I I I I I | I I I I I I II I I I I I I 1 I I I I I M I I I I M I I M I I I I I I I M I I I I I I I M 
29218 AGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGC 29159 



1490 
29158 

1550 
29098 

1610 



AGTCGTGCTTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATA 

I, HUM Mill IIIIIIIIIIIMIIIMIIIIIIIIIIIIIIIMI 

AGTCGTGCTTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATA 

AATACAGCTCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACC

| | | | | | | | | M | | | | | I I I I I I I I II I I I I I I I I I I I I I I I II I I I I Ml 

AATACAGCTCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACC 



GAAGT CAT T AAAAC AAAAT GAAAC AT T T G C C AAAAC AAAAC AAAAAACT AT GT AT T T G C A 

| | | | | | | | | M | | | I I II I II I I I I I I I I I I I I I I I I I I I I I I I I HI 

29038 GAAGTCATTAAAACAAAATGAAACATTTGTCAAAACAAAACT^AAAAACTATGTATTTGCA 



1670 
28978 

1730 
28918 

1790 
28858 



1549 
29099 
.CC 1609 
29039 
1669 
28979 
1729 



CAGCACACTATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATT 

I I I I I M I I I I I I I I I I I M I I M I I I I I II II I I I II I I I I I I I I I I I M I I II I I I I 

CAGCACACTATTAAAATATTAAGTGTAATTATTTTAACACTCATAGCTACATATGACATT 28919 
TTATGAGCTGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGA 1789 

Ml Ml I IN Mill I I M II M M I I I I I M I I I M M I I I M II I I I M I I MM 

TTATGAGCTGTTTACAGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCATTGTGA 28859 

AAGCACTTAATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATT 1849 

M | II II M I II I M I I I I I I I I I I I I M I I I I II I M I I I I I I I M I I I 

AAGCACTTACTTTTTTATGGTTAGCACTTCAACATAGCTCTTAATAACTTCCAGGATATT 28799 

1850 CACACAACACTTAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAG 1909 

I | M II II II I I I I II M M I I II I II M I I I I II I I I M II I I ''Ml 
28798 CACACAACCCTTAGGCTTAAAAATGAGCTCACTCGGAATTTCTATT TAAGAG 28747 

1910 ATTTATTTTTAAATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGA 1969 
M I M I I II I I I I I I I M I M I Ml III Mill I Ml I I I I I I I I I I II I I I I I I I 
28746 AT T T ATTTT T AAAT C AAT GT GAAT CT GAT AC AAAGGAAGAGT AAGT C ACT GT AAAACAGA 28687 

AAATGAAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTT 2029 



197 0 ACTTTT - 

iTlT I I I I I I I I II MINI Ml I I MUM I Ml II III I III IN 

28686 ACTTT 



TAAATGAAGCTTAAATTACCCAATTTGAAATTTTAAAATCCTTTAAAAGAACTTT 28627 



2 030 TCAATTAATATTATCACACT-ATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTT 2088 
I | || | | | | | II I I II II I I M I I I I II I I II II II II I I II II MM 

28626 TT AATTAAT ATTTTC ACACTGCT GATCAGACT GTAAT TAGAT GCAAAT GAGAGAGT AGTT 28567 
2089 TAGTTGTTGCATTTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAA 2148 

I | | | || || | | | I II II I II I I I II I II I I II I I I I I I II I I I I I 

2 8566 TAGTTGCTGTATTTTTTGGACACTAGAAACATTTAAATGATCAGGAGGGAGTAACTGAAA 28507 

2149 GAGCAAGGCTGTTTTTGAAAATCATTACA CTTT CACTAGAAGCCCAAACCT CAGCAT 2205 

M | | || | M II I M II II I II M I M M IMMMIIMIMIMMIMMIMI 

GAACAAGGCTGTTTTTGAAAATCATTACACTCCTTTCACTAGAAGCCCAAACCTCAGCAT 28447 



28506 

2206 T CT G C AAT AT GT AAC C AAC AT GT C AC AAAC AAGC AG CAT GT AAC AGACT GGCAC AT GT GC 

i M M ii 1 1 ii ii i ii 1 1 1 1 ii 1 1 1 1 1 1 1 1 1 1 m i ii i LILLU '!!!!! 

28446 



2265 



T CT G C AAT AT GT AAC C AAC AT GT T ACAAAC AAGC AG CAT GT AACAAACT G GCAC AT GT GT 28387 



2266 CAGCTGAATTTAAAATATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGT 2325 
28386 CAGCCAAATCTAAAATATAATACTTTTAAAAAGAAAATTATTACACCCTTTACATTCAGA 28327 
2326 TAAGATCAAACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTT 2385 

i i i l I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I M I I I I I I I I I I I I I I I I N 

28326 TAAGATCAAACCTCAC 28267 
2386 T GAAT CT GT CAT T CAC AT AC C CT GT GAAGAC AAT ACT AT CT AC AATTTT T T C AGGAT 2445 

mm i in i Minn i mi i iii ii mini mm i m 1 1 1 1 1 1 1 1 m i 

28266 TGAATCTGCCATTCACACAGCCTGTGAAGAAAATACTATCTACAAATTTTTCAGGATTAT 



28207 
2505 



244 6 TAAAATCTTCTTTTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAAT 

I I I I II M M M I I I Ml III I I M I I I I I M I I I M I I I M I I I III II I II I I I I 
28206 TAAAATCTTCTTCTTTCACTATTGTAGCTTAAACTCTGTTTGGTTTTGTCATCCGTAAAT 28147 



2506 ACTTACCTACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCT 

MMI M I M I M I I I M M M I I I I I MUM IIIMMII llllll I 

28146 ACTTAGCTACATACACTGCATGTAGACGATTAAACGAGGGCGGGCCCTGTGTTCATAGTT 

2566 TTACGATGGAGAGATGCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGT 

MM II II M I II II M I I I II I I I I II I I I I I I I I I I I II I I II I M I I II I III 
28086 TTACAATGGAGAGATGCCAGTGACCTCATAATAGAGACTGTGAACTGCCTGGTGCGATGT 



2565 
28087 
2625 
28027 



CCACATGACAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTC 2685 



2626 YYTTTTTTT 1 1 TTTTmTTTTT iTi i 1 1 1 1 m 1 1 1 n 1 1 1 1 1 1 n 1 1 n ii i , 

28026 CCACATGACAAGGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTAGTTAAAATGGTTTC 
2686 TAGCATATGTATAATGCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACA. 

mi mm i mini i minni mi iiii ii i i ii i i i i m i i i i i i m i i 

27966 TAGCATATGTATAATGCTGTAGTTAAAACACTGTTTTGCAAAATCATACAGATTAGTACA 



2746 TTTAACAGCTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAAT 
27906 TTTAATGGCTACCTGTAAAGCTTATTACTAGTTTTTGTATTATTTTTGTAAATAGCCAAT 



27967 
2745 
27907 
2805 



Mill I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II M I II II M I I I M M I I 
11111 .. . I ■« ^^.iTim-j,mmTi/-rn ArirrTTTrpi^TiaTTaTTTTTGTAAATAGCCAAT 27847 

2865 



2806 AGAAAAGTTTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGA 

1 1 1 1 1 1 1 1 m m i n i n m 1 1 n 1 1 1 1 1 1 1 m i n 1 1 1 1 1 1 1 m n n i n i 

27846 AGAAAAGTGTGCTTGACGTGGTGCTTTTCTTTCACTTAGAGGCAAAACTGCTTTTTGAGA 27787 
2866 CCGTAAGAACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGT 2925 

I 1 1 ii 1 1 m 1 1 1 1 1 m i ii 1 1 1 1 ii 1 1 1 n 1 1 1 n I I I I I I I I II I I II I I I I I I I n 

27786 cTGtUgAAC^ 27727 
2926 GCCTTAGGATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAG 2985 

MMI II I I I I I II I I I II II I II I M M I I M I M I II I I I I I I I I 



27726 



GCCTTAGAATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAG 27 667 



2986 AGAGGAAATGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAA 

, M i,i. MMM iii i mim mi 1 1 iii mi inn ii mm mi 

27 666 AGAGGAAATGAGGTGGGGTGAGAGGAAACTCATGGGGACAGATTCCCATTCTTAGCCTAA 
3046 CGTTCGTCATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACAC 

M M I I I M I I I M M II I I M M I II I M M I II I I II I I M M I I M II II I I M I M 

27 606 CGTTCGTCATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACAC 



3106 



AGTGCAATGTTCTCAGAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTC 



3045 

27607 

3105 

27547 

3165 



1 1 1 1 1 1 1 1 1 1 1 1 1 ii 1 1 1 1 ii I M i mmmmimmiiim mil 

27546 AGTGCAATGTTCTCAGAGTGACTTTAGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTC 
3166 TTAAAATATGCCCAAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGA 

Mlllllllll I I I I I I 1 mi 

27486 TTAAAATATGCCCAAATTTTTACTTTTTTTTTCTTTTAGTAAACTGGGCCACATGTTGGA 



27487 
3225 
27427 
3285 



322 6 AAT AAG C T AGT AAT GT TGTTTTCTGT C AAT AT T GAAT GT GAT GGT AC AGT AAAC C AAAAC 

| | || I II I Ml Ill I II II II II II I I I I I I I I I I I I I MM I III M 

27426 AAT AAG C T AGT AAT GT T GT T T T CT GT C AAT AT C GAAT GT GAT GGT GC AGT AAAC C AAAAC 27367 

3286 CCAACAAT GT GGC CAGAAAGAAAGAGCAATAATAATTAATT CACACACCAT AT GGATT CT 3345 

I M | | M I I I I M M II II M I I II I I I M M I I I I M I M M 

27366 C C AAC AAT GT G GC C AGAAAGAAAGAGC AAT AAT GAT T AAT T C AC AT G C CAT GT GGAT T CT 27307 

3346 AT T T AT AAAT CAC C C ACAAACT T GTT CT T T AAT T T CAT C C CAAT C AC T T T T T C AGAGGC C 3405 

|| | | | | | || | | || | || I I I I I II I M II I I I II M II I II I I I I M I M II 

27306 AT TT AT AAAT CAC C CAC AAAC T T GT TT T T T AAT T T CAT C C CAAT CAT T T T T T C AGAGGC C 27247 

3406 T GT TAT C AT AGAAGT CAT T TT AGACT CT CAAT T T T AAAT TAATT T T GAAT C ACT AAT AT T 3465 

I I I I M I I II I I I I MMIMIMI I 1 I I 1 I I I I I I 1 I 1 IMMM Mill 

2724 6 T GT TAT CAT AGAAGACAT T T T AGACTT GCAAT T T T AAAT TAACT T T GAAT C ACT AAT AT T 27187 
3466 T T CAC AGT T TAT T AAT AT A- T T T AATT T CT AT T T AAAT T T T AGAT T ATT T T T ATT AC CAT 3524 

Mill M I I I II I M I II II II I II I I I II M I M M I M I II I I I I M I I I 

T T C ACAGT T TAT T AAT AT AT TT T TAT T T CT AT T TAAAT T T T AGAT TAT T T T T ATT AC CAT 



27186 
3525 
27126 



GT ACT GAAT T T T T AC AT C CT GAT AC CCTTTCCTTCTC CAT GT C AGT AT CAT GT T CT CT AA 

| | | I I II M II I M I I I I II I M II I I I II I I I II M I I I I I II I M I I I Ml 

GT ACT GAAT T T T TAT AT C CT GAT AC CCTTTCCTTCTC CAT GT C AGT AT CAT GTT CT GT AA 



3585 T TAT CT T GC CAAAT T TT GAAACT AC ACACAAAAAGCAT ACT T GC AT TAT T T ATAAT AAAA 

IMMM I I II I I I I M I I M I I I M II I II I M I I I M M I I I I II M II I M I I II 

27 066 T TAT CT T AC CAAAT T TT GAAACT GC AC ACAAAAAGCAT ACT T GCAT T AT T TAT AAT AAAA 



27127 
3584 
27067 
3644 
27007 
3704 



3645 T T GCAT T C AGT GG CT T T T T AAAAAAAAT GT T T GATT CAAAACT T T AACAT ACT GAT AAGT 

|| I MUM MM M II I II I I II I I I II M II I IMMM II IN Ml IN 

27 006 T T G CAT T C AGT GGCTTTTT - AAAAAAAT GT TT GAT T CAAAAT T T T AACAT ACT GAT AAGT 2 694 8 



3705 AAGAAACAAT TAT AAT T T CT T T ACAT ACT CAAAAC C AAGAT AGAAAAAGGT GCT AT C GTT 

M II I I II II II M I II M I II M I II II M I M M II II I II II II M I II 

26947 AAGAAACAATAATAATTTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATTATT 



3764 
26888 



3765 CAACT T CAAAAC AT GT T T C CT AGT AT T AAGGACT TT AAT AT AGC AAC AGACAAAAT TAT T 3824 

|| | || | || | || | | II II I I II II II I I II I I I I M I II Ml 

T AACTT CAAAAC AT GT T T C CT AGT AT T AAGAACT T T AAT AT AGCAACAGACAAAAT TAT T 



26887 
3825 



GTTAACATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTT 

I || II II II II I I M II II I II II I M I M I I M II II II M M M II 

26827 GTTAACATGAATGTTACAGCTCAGAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTT 



26828 
3884 
26768 
3944 



3885 ATTATCCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATA 

Ml MM I II II I I II II III I I III II III II I MUM I Ml 

267 67 AT TAT C C ACT G CT AAT GT GGAT AT AT GT T C AAAC AC C T T T T AGT AT T GAT AGCT T AC AT A 2 6708 



3945 TGGCCAAAGGAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAG 4004 
|| I II II II I I M M II I I II I M M M II I M II I M I M II II II II II II I M 



Db 



26707 T G G C CAAAGGAAT AC AGT T T AT AGT GAAACAT GGGT AT ACT GT AGCT AACT T T AT AAAAC 26648 



Qy 4005 TGTAATATAACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGG 4064

Mill Ill I I M I I IIIIIIIIIMIII II II I II I 

Db 26647 TGTAATATAACAATGTAAAAAATTATATACCTGGGGGGATTTTTTGGTTGCTTAAAGTGG 26588 

Qy 4065 CT AT AGT TACT G A- T T T T T TAT TAT GTAAGC AAAAC C AAT AAA AATTTAAGTTTTT 4119 

| | | | | | | I I I I I I I II I I I I I II I I I I I I I M I I I I I I > I 1 I M I 

Db 26587 CTATAGTCACTGATTTTTTTATTATGTAAGCAAAACCAATAAACTTTAGGTTGTGTTTTT 26528 

Qy 4120 TTAACAACTACCTTATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGT 4179

| M I I II I II I I I I I I I I M I I I M I I I M I I I I M I I I I I II I I I III 

Db 26527 TTAACAACTAGCTTATTTTTCATTGTACAGGCACTAATTCATTAAATACTAATTGACTGT 26468 

Qy 4180 TTAAAAGAAATATAAATGTGACAAGTGGACATTATTTATGTTAAATATACAATTATCAAG 4239

| | | M | | I I I I I I I I I I I I I I I I I I I I I M I II I I I I I I I 

Db 26467 TTAAAGGAAATATAAATGTGACAAGTGGACACTATTTATGTTAAATATACAATCATCAAG 26408

Qy 4240 CAAGTATGAAGTTATTCAATTAAAATGCCACATTTCTGGTCTCTGG 4285 

| | || | | | I || II I I I I I I I I I I I I M I II I I I I I I I I I I I I I I I I 
Db 26407 GAAGTATGAAGTTATTCAATTAAAATGCCACATTTCTGGTCTCTGG 26362 
AC129069 185870 bp DNA linear HTG 19-SEP-2002

Papio anubis clone RP41-240D13, WORKING DRAFT SEQUENCE. 

AC129069 

AC129069.2 GI:23196382
HTG; HTGS_PHASE2; HTGS_DRAFT. 
Papio anubis (olive baboon) 
Papio anubis 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Cercopithecidae; 
Cercopithecinae; Papio. 

1 (bases 1 to 185870) 

Akhter,N., Antonellis,A., Ayele,K., Beckstrom-Sternberg,S.M.,
Benjamin,B., Blakesley,R.W., Bouffard,G.G., Brinkley,C., Brooks,S.,
Cariaga,K., Coleman,B., Dietrich,N.L., Granite,S., Guan,X.,
Gupta,J., Haghighi,P., Han,J., Hansen,N., Ho,S.-L., Idol,J.R.,
Karlins,E., Laric,P., Lee-Lin,S.-Q., Legaspi,R., Maduro,Q.L.,
Maduro,V.B., Margulies,E.H., Masiello,C., Maskeri,B.,
Mastrian,S.D., McCloskey,J.C., McDowell,J., Paguirigan,C.,
Pearson,R., Portnoy,M.E., Prasad,A., Reddix-Dugue,N.,
Schueler,M.G., Sison,C., Stantripop,S., Thomas,J.W., Thomas,P.J.,
Touchman,J.W., Vogt,J.L., Walker,M., Wetherby,K.D., Wiggins,L.,
Young,A., Zhang,L.-H. and Green,E.D.
NISC Comparative Sequencing Initiative 
Unpublished 

2 (bases 1 to 185870) 
Green, E.D. 

Direct Submission 

Submitted (25-JUL-2002) NIH Intramural Sequencing Center, 8717 
Grovemont Circle, Gaithersburg, MD 20877, USA 

3 (bases 1 to 185870) 
Green, E.D. 

Direct Submission 



JOURNAL Submitted (19-SEP-2002) NIH Intramural Sequencing Center, 8717
Grovemont Circle, Gaithersburg, MD 20877, USA
COMMENT On Sep 19, 2002 this sequence version replaced gi:21955004.

Genome Center 

Center: NIH Intramural Sequencing Center 
Center code: NISC 

Web site: http://www.nisc.nih.gov 
Contact: nisc_zoo@nhgri.nih.gov

Project Information 

Center project name: deg 
Center clone name: 240D13 



The sequence data in this record represents an 'enhanced'
version of a Phase 2 submission. Specifically, the indicated 
order and orientation of each sequence contig has been 
established using one or more of the following: read-pair 
data from individual subclones, overlaps with neighboring 
clones, alignment with available reference sequence (e.g., 
human), and/or confirmation by PCR testing. In addition, 
the sequence assembly is based on at least 8X average 
coverage in Q20 bases and has been reviewed to rule out 
gross misassemblies, the low-quality ends of sequence 
contigs have been trimmed away, and each base is associated 
with a Phrap-derived quality score. 

Summary Statistics 

Sequencing vector: plasmid; n/a; 100% of reads 
Chemistry: Dye-terminator Big Dye; 100% of reads 
Assembly program: Phrap; version 0.990319 
Consensus quality: 184076 bases at least Q40 
Consensus quality: 185363 bases at least Q30 
Consensus quality: 185733 bases at least Q20 
Insert size: 152000; agarose-fp 
Insert size: 185870; sum-of-contigs 
Quality coverage: 7.20x in Q20 bases; agarose-fp 
Quality coverage: 5.89x in Q20 bases; sum-of-contigs 

* NOTE: This is a 'working draft' sequence. It currently 

* consists of 1 contigs. Gaps between the contigs 

* are represented as runs of N. The order of the pieces 

* is believed to be correct as given, however the sizes 

* of the gaps between them are based on estimates that have 

* provided by the submittor. 

* This sequence will be replaced 

* by the finished sequence as soon as it is available and 

* the accession number will be preserved. 

* 1 185870: contig of 185870 bp in length. 
FEATURES Location/ Qualifiers 

source 1..185870

/organism="Papio anubis" 

/mol_type="genomic DNA" 

/db_xref="taxon:9555"

/clone="RP41-240D13"

/clone_lib="RP41"
misc_feature 1..185870

/note="assembly_fragment 

clone_end:T7 
vector side: left 



missing approximately 55 bases, including Sp6 clone end, 
on 3' end of insert" 
misc_feature 121812..185870

/note="clone overlaps with GenBank Accession Number

AC130785 clone RP41-325P5 (center project name deh)"

ORIGIN 

Query Match 59.3%; Score 2550; DB 2; Length 185870; 

Best Local Similarity 94.8%; Pred. No. 0; 

Matches 2717; Conservative 0; Mismatches 130; Indels 19; Gaps /; 
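
The percentages in the summary lines above appear to follow directly from the counts printed beside them. The sketch below is a minimal illustration in Python, not GenCore code: the function names are hypothetical, and both formulas are assumptions inferred from the figures in this listing (Score relative to the Perfect score of 4301 given in the run header, and Matches relative to the total number of aligned columns).

```python
def query_match_pct(score, perfect_score):
    """Assumed definition: hit score as a percentage of the perfect score."""
    return 100.0 * score / perfect_score


def local_similarity_pct(matches, conservative, mismatches, indels):
    """Assumed definition: matches as a percentage of all aligned columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Figures taken from the AC129069 summary lines above.
print(round(query_match_pct(2550, 4301), 1))             # 59.3
print(round(local_similarity_pct(2717, 0, 130, 19), 1))  # 94.8
```

Under the same assumptions, a hit with Score 1691.8, 1696 matches and 7 mismatches gives 39.3% and 99.6%, which agrees with the values reported for the 1873 bp patent sequences later in this listing.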

Qy   1430 AGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGC 1489
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 151685 AGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGC 151626

Qy   1490 AGTCGTGCTTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATA 1549
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 151625 AGTCGTGCTTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATA 151566

Qy   1550 AATACAGCTCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACC 1609
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 151565 AATACAGCTCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACC 151506

Qy   1610 GAAGTCATTAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCA 1669
          ||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||
Db 151505 GAAGTCATTAAAACAAAATGAAACATTTGTCAAAACAAAACAAAAAACTATGTATTTGCA 151446

Qy   1670 CAGCACACTATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATT 1729
          ||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||
Db 151445 CAGCACACTATTAAAATATTAAGTGTAATTATTTTAACACTCATAGCTACATATGACATT 151386



Qy   1730 TTATGAGCTGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGA 1789
          ||||||||||||||| ||||||||||||||||||||||||||||||||||||| | ||||
Db 151385 TTATGAGCTGTTTACAGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCATTGTGA 151326

Qy 1790 AAGCACTTAATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATT 1849

| | | | | | | | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I M 
Db 151325 AAGCACTTACTTTTTTATGGTTAGCACTTCAACATAGCTCTTAATAACTTCCAGGATATT 

151266 

Qy 1850 CACACAACACTTAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAG 1909

Ill | I I I II M I I I I I I M I I II II I I I Mill 

Db 151265 CACACAACCCTTAGGCTTAAAAATGAGCTCACTCGGAATTTCTATT TAAGAG 

151214 

Qy 1910 ATTTATTTTTAAATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGA 1969

| | | | || | | | | | | | || I I I I II II II I II I I I I M I I I I I I I I I I I I I I I I 

Db 151213 AT TTAT T T T T AAAT C AAT GT GAAT CT GAT ACAAAGGAAGAGT AAGT C ACT GT AAAAC AGA 

151154 



Qy 1970 ACTTTTAAATGAAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTT 2029



1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I I I I 

Db 151153 ACTTTTAAATGAAGCTTAAATTACCCAATTTGAAATTTTAAAATCCTTTAAAAGAACTTT 

151094 

Qy 2030 TCAATTAATATTATCACACT-ATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTT 2088

UNI II I MINI MINI I I MINIMI I MINI MM 

Db 151093 TTAATTAATATTTTCACACTGCTGATCAGACTGTAATTAGATGCAAATGAGAGAGTAGTT 

151034 

Qy 2089 TAGTTGTTGCATTTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAA 2148

| | | | | | || | | I II I I I II I II II I I M 

Db 151033 TAGTTGCTGTATTTTTTGGACACTAGAAACATTTAAATGATCAGGAGGGAGTAACTGAAA 

150974 

Ov 2149 GAGCAAGGCTGTTTTTGAAAATCATTACA CTTTCACTAGAAGCCCAAACCTCAGCAT 2205 

|| Ml I I I I I M I II I I I I I II I I I I I I II M I I I I M I M I I I I I II M I 

Db 150973 GAACAAGGCTGTTTTTGAAAATCATTACACTCCTTTCACTAGAAGCCCAAACCTCAGCAT 

150914 

Qy 2206 TCTGCAATATGTAACCAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGC 2265

Mill MIIIIMM I Ml MIIMIIIMM 

Db 150913 T CTGCAAT AT GT AACCAACATGTTACAAACAAGCAGCAT GT AACAAACT GGCACATGT GT 

150854 

Ov 2266 CAGCTGAATTTAAAATATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGT 2325 

| | | | I I II II I I I I II I M I II I I M II I II I I I I II 

CAGCCAAATCTAAAATATAATACTTTTAAAAAGAAAATTATTACACCCTTTACATTCAGA 



Db 150853 
150794 



2385 



Ov 2326 TAAGATCAAACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTT 

M | | | | || M II I I I I M I M I I I II M I I I I I M M I M I I I I I I I I I II 

Db 150793 TAAGATCAAACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTGCCCCAAAAGACTTCTT 

150734 

Ov 2386 TGAATCTGTCATTCACATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTAT 2445 

| || || | || Ml I I I I I M II I I II I I II I I I II I I I I II I I I I II II II 

Db 150733 TGAATCTGCCATTCACACAGCCTGTGAAGAAAATACTATCTACAAATTTTTCAGGATTAT 

150674 

Qy 2446 TAAAATCTTCTTTTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAAT 2505

M I I I II M I I MM II I I I II I I MIMI 

Db 150673 TAAAATCTTCTTCTTTCACTATTGTAGCTTAAACTCTGTTTGGTTTTGTCATCCGTAAAT 

150614 

Qy 2506 ACTTACCTACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCT 2565

Mill I M I M I M II II I II MIMI I II I II I M I I M II I 

Db 150613 ACTTAGCTACATACACTGCATGTAGACGATTAAACGAGGGCGGGCCCTGTGTTCATAGTT 

150554 

Qy 2566 TTACGATGGAGAGATGCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGT 2625

I I I | MUM II I II II I M II M I I I M II II I I I I M Ml 

Db 150553 TTACAATGGAGAGATGCCAGTGACCTCATAATAGAGACTGTGAACTGCCTGGTGCGATGT 
150494 

Qy 2626 CCACATGACAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTC 2685

* M || | || II I I I II II II M M I II I I I I I I I M M I I I M I II I II I I I I M M M I 



Db 150493 CCACATGACAAGGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTAGTTAAAATGGTTTC 

150434 

Ov 2686 T AGC AT AT GT AT AAT GCT AT AGTT AAAAT ACT ATT T T T CAAAAT C AT AC AGAT T AGT AC A 2745 

| | | | | M I I I I II I I I I I Ml I I I II II I I I I I I I I 

Db 150433 TAGCATATGTATAATGCTGTAGTTAAAACACTGTTTTGCAAAATCATACAGATTAGTACA 

150374 

Qy 2746 TTTAACAGCTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAAT 2805

| | M I I I I I I I I I I I I I I I I I I I I I II 

Db 150373 TTTAATGGCTACCTGTAAAGCTTATTACTAGTTTTTGTATTATTTTTGTAAATAGCCAAT 

150314 

Qy 2806 AGAAAAGTTTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGA 2865

| | M | | | | | | | I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I 

Db 150313 AGAAAAGTGTGCTTGACGTGGTGCTTTTCTTTCACTTAGAGGCAAAACTGCTTTTTGAGA 

150254 

Qy 2866 CCGTAAGAACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGT 2925

| | | | I I I I II I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 150253 CTGTAAGAACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTAAATCTTCTAAGCAAAGT 

150194 

Qy 2926 GCCTTAGGATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAG 2985

| | | | | | | M I I II I I I I I I I I I I I I I I I I I I I I I I I N I I I I I M I I I I M I I I I I I I I 
Db 150193 GC CT T AGAAT AG CTT GGGAT GAGAT GT GT GT GAAAGT ATGTAC AAGAGAAAAC GGAAGAG 

150134 

Qy 2986 AGAGGAAATGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAA 3045

I I | | I I | I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I M I I I I I I I I 

Db 150133 AGAGGAAATGAGGTGGGGTGAGAGGAAACTCATGGGGACAGATTCCCATTCTTAGCCTAA 

150074 

Qy 3046 CGTTCGTCATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACAC 3105

M | | I I I I I I II I I I I I I I I I I I I I I M I I I I I I I I I II I I II I I I I I I I I I I I I 

Db 150073 CGTTCGTCATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACAC 

150014 



Ov 3106 AGTGCAATGTTCTCAGAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTC 

| | | | | I I I I II II II I I I I I I I I I I I I II I I I I I I I I I I I I I I I II 

Db 150013 AGTGCAATGTTCTCAGAGTGACTTTAGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTC 



149954 



Ov 3166 TTAAAATATGCCCAAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGA 

MINIM: I II I I II I I I I I II I 1 1 I I I I II I I I I I I I M I 1 1 1 1 I I I 

Db 14 9 953 TTAAAATATGCCCAAATTTTTACTTTTTTTTTCTTTTAGTAAACTGGGCCACATGTTGGA 



3165 



3225 



149894 



Ov 3226 AATAAGCTAGTAATGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAAC 3285 

| | | | | I I I I I I I I I I I M I I I I I I I I I I I I I I I I I II I I I I I I I 

Db 149893 AATAAGCTAGTAATGTTGTTTTCTGTCAATATCGAATGTGATGGTGCAGTAAACCAAAAC 



149834 



Ov 3286 C CAACAAT GT GGCCAGAAAGAAAGAGCAAT AAT AATT AATTC ACACACCAT AT GGATT CT 3345 

| | | | | | | I I I II I M I I II I I I I I I I I I M I I I I I 

Db 149833 CCAACAATGTGGCCAGAAAGAAAGAGCAATAATGATTAATTCACATGCCATGTGGATTCT 



149774 



Ov 3346 ATTTATAAATCACCCACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCC 3405 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 149773 ATTTATAAATCACCCACAAACTTGTTTTTTAATTTCATCCCAATCATTTTTTCAGAGGCC 

149714 

Ov 3406 TGTTATCATAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATT 34 65 

| | | | | | | | M M I I I I I I I I I I I I I I II I I I I I I I II I M 

Db 149713 TGTTATCATAGAAGACATTTTAGACTTGCAATTTTAAATTAACTTTGAATCACTAATATT 

149654 

Ov 3466 TTCACAGTTTATTAATATA-TTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCAT 3524 

| | | | | | | | | M Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 149653 TTCACAGTTTATTAATATATTTTTATTTCTATTTAAATTTTAGATTATTTTTATTACCAT 

149594 

Ov 3525 GTACTGAATTTTTACATCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAA 3584 

I | | || | | | | M II I I II M I I I I I I I M I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 

Db 149593 GTACTGAATTTTTATATCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTGTAA 
149534 

Ov 3585 TTATCTT GCCAAATTTT GAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAA 3644 

Y | | | | | | I I I I I I I I I | | I I I I I I I M I I I I I I I I I I I I I I I I I I I I I 

Db 149533 TTATCTTACCAAATTTTGAAACTGCACACAAAAAGCATACTTGCATTATTTATAATAAAA 

149474 



Qy 3645 



TTGCATTCAGTGGCTTTTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGT 3704 

| | || | | | | II I I I III I M IN I I I I I I I I I I I I I I I I I I 

Db 149473 TTGCATTCAGTGGCTTTTT-AAAAAAATGTTTGATTCAAAATTTTAACATACTGATAAGT 



149415 



AAACAAT T ATAATT T CTTTACAT ACT CAAAACCAAGATAGAAAAAGGTGCTAT CGTT 3764 



Ov 3705 AAG. , , , 

| | | | | | | | | I I II I I I I I I I I I I I I I I I I I M I I I I I I I I M I I I I I M 

AAGAAACAATAATAATTTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATTATT 



Db 149414 
149355 



Ov 3765 CAACTTCAAAACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATT 3824 

Y | || | | | I I I I I I I I I I II I I I I I I I M I II I I I M I I I I I I I I M I I 

Db 149354 TAACTTCAAAACATGTTTCCTAGTATTAAGAACTTTAATATAGCAACAGACAAAATTATT 



149295 



Ov 3825 GTTAACATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTT 

| | | I I I I I I I I M I I I I II I I I I I I I I I I I I I I I I I I I M 

GTTAACATGAATGTTACAGCTCAGAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTT 



3884 



Db 149294 
149235 

Qy 3885 

Db 149234 
149175 



ATTATCCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATA 3944 

| | | | | | | | | | | | | | | | I I I I I I I I I I II I II I I I I I I I M I I I I I I I I I I I I I I I M I I 
ATTATCCACTGCTAATGTGGATATATGTTCAAACACCTTTTAGTATTGATAGCTTACATA 

lAAAGGAATACAGTTTATAGCAAAACATGGGTAT GCTGTAGCTAACTTTATAAAAG 4004 



Q V 3945 TGGCC 

J 1 1 1 1 1 1 1 1 1 M M 1 1 1 1 ii 1 1 minim; I I I I I I I I I I I I I I I M I I I 

Db 149174 TGGCCAAAGGAATACAGTTTATAGTGAAACATGGGTATACTGTAGCTAACTTTATAAAAC 

149115 

Qy 4005 TGTAATA' 



TAACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGG 4064 



Db 149114 
149055 



I MIIIIIIIMI M MMI MINIM M MMIMI 

T GTAAT AT AACAAT GT AAAAAAT TAT AT AC CT GGG G GGAT TTTTTGGTTGCT T AAAGT GG 



Qy 



Db 149054 
148995 



4065 CT AT AGT T AC T G A- T T T T T TAT TAT GT AAGCAAAAC CAAT AAA AATTTAAGTTTTT 4119 

| | | | | | | | | | | | | | I I I I II I I I I I M I I I M M I II I I II I M I II I M 

CTATAGTCACTGATTTTTTTATTATGTAAGCAAAACCAATAAACTTTAGGTTGTGTTTTT 



CTACCTTATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGT 4179 



Ov 4120 TTAACAA* , , 

Y I I I I I I II M IIIMIIIMI I II M I I I M II I M I II II M I II M I I I M I I - 

Db 148994 TTAACAACTAGCTTATTTTTCATTGTACAGGCACTAATTCATTAAATACTAATTGACTGT 
148935 

Qy 4180 TT 



1 AAAAGAAAT AT AAAT GT GACAAGT GGAC AT T ATT TAT GT T AAAT AT AC AAT TAT C AAG 4239 



Db 148934 
148875 



I | M | | | | | | | | | | I I II I II M I II II M I I I M II II I M I I M MM' 1 

T T AAAGGAAAT AT AAAT GT GACAAGT GGACACT AT T TAT GT T AAAT AT ACAAT CAT C AAG 



Ov 4240 CAAGTATGAAGTTATTCAATTAAAATGCCACATTTCTGGTCTCTGG 4285 

| | | | | M I II I II M M M I II I II M I M M II I M II I II I II 

Db 148874 GAAGTAT GAAGTT ATT CAAT TAAAAT GCCACATTT CT GGT CT CT GG 148829 



RESULT 13

AR165435

LOCUS AR165435 1873 bp DNA linear PAT 17-OCT-2001
DEFINITION Sequence 13 from patent US 6280931.
ACCESSION AR165435
VERSION AR165435.1 GI:16240327
KEYWORDS
SOURCE Unknown.
ORGANISM Unknown.
Unclassified.
REFERENCE 1 (bases 1 to 1873)
AUTHORS Sakamoto,A. and Hanaoka,F.
TITLE Method for specifically amplifying a dystroglycan,
.alpha.-sarcoglycan, or endothelin B receptor cDNA of an extremely
small
JOURNAL Patent: US 6280931-A 13 28-AUG-2001;
FEATURES Location/Qualifiers
source 1..1873
/organism="unknown"
/mol_type="unassigned DNA"
ORIGIN

Query Match 39.3%; Score 1691.8; DB 6; Length 1873;
Best Local Similarity 99.6%; Pred. No. 5.5e-293;
Matches 1696; Conservative 0; Mismatches 7; Indels 0; Gaps 0;

Qy 178 TGAAACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGC 237
       ||   || ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 171 TGTCTCTAGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGC 230

Qy 238 ATGCAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 297
       ||||||||||||||||||||||||||||  ||||||||||||||||||||||||||||||
Db 231 ATGCAGCCGCCTCCAAGTCTGTGCGGACCGGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 290

Qy 298 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 357
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 291 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 350

Qy 358 CAAACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCC 417
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 351 CAAACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCC 410

Qy 418 AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 477



Qy 898 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTT 957
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 891 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTT 950

Qy 958 GATATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTT 1017
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 951 GATATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTT 1010

Qy 1018 CAGAAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTC 1077
        |||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||
Db 1011 CAGAAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTATTCAGTTTC 1070

Qy 1078 TATTTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATG 1137
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1071 TATTTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATG 1130



Qy 1138 TTGAGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAA 1197
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1131 TTGAGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAA 1190

Qy 1198 GTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCAC 1257
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1191 GTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCAC 1250

Qy 1258 CTCAGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTT 1317
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1251 CTCAGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTT 1310

Qy 1318 TTGAGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGC 1377
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1311 TTGAGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGC 1370

Qy 1378 ATTAACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGC 1437
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1371 ATTAACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGC 1430

Qy 1438 TTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGC 1497
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1431 TTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGC 1490

Qy 1498 TTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGC 1557
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1491 TTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGC 1550

Qy 1558 TCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCAT 1617
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1551 TCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCAT 1610

Qy 1618 TAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACAC 1677
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1611 TAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACAC 1670

Qy 1678 TATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGC 1737
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1671 TATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGC 1730

Qy 1738 TGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTT 1797
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1731 TGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTT 1790

Qy 1798 AATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAAC 1857
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1791 AATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAAC 1850

Qy 1858 ACTTAGGCTTAAAAATGAGCTCA 1880
        |||||||||||||||||||||||
Db 1851 ACTTAGGCTTAAAAATGAGCTCA 1873



RESULT 14

LOCUS E15242 1873 bp DNA linear PAT 28-JUL-1999
DEFINITION Human mRNA for endothelin B receptor, complete cds.
ACCESSION E15242
VERSION E15242.1 GI:5709925
KEYWORDS JP 1998057064-A/13.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 1873)
AUTHORS Sakamoto,E. and Hanaoka,F.
TITLE SPECIFIC AMPLIFICATION OF MINOR GENE PRODUCT
JOURNAL Patent: JP 1998057064-A 13 03-MAR-1998;
RIKAGAKU KENKYUSHO
COMMENT OS Homo sapiens (human)
PN JP 1998057064-A/13
PD 03-MAR-1998
PF 16-AUG-1996 JP 1996216506
PI SAKAMOTO EIJI, HANAOKA FUMIO
PC C12N15/09,C07H21/02,C07H21/04//C12Q1/68;
CC strandedness: Double;
CC topology: Linear;
FH Key Location/Qualifiers
FH
FT source 1..1873
FT /organism='Homo sapiens'
FT /tissue_type='peripheral blood'
FT CDS 231..1559
FT /product='endothelin B receptor'
FEATURES Location/Qualifiers
source 1..1873
/organism="Homo sapiens"
/mol_type="genomic DNA"
/db_xref="taxon:9606"
ORIGIN

Query Match 39.3%; Score 1691.8; DB 6; Length 1873;
Best Local Similarity 99.6%; Pred. No. 5.5e-293;
Matches 1696; Conservative 0; Mismatches 7; Indels 0; Gaps 0;

Qy 178 TGAAACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGC 237
       ||   || ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 171 TGTCTCTAGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGC 230

Qy 238 ATGCAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 297
       ||||||||||||||||||||||||||||  ||||||||||||||||||||||||||||||
Db 231 ATGCAGCCGCCTCCAAGTCTGTGCGGACCGGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 290

Qy 298 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 357
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 291 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 350

Qy 358 CAAACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCC 417
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 351 CAAACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCC 410

Qy 418 AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 477
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 411 AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 470



Qy 478 CCGCCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTC 537
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 471 CCGCCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTC 530

Qy 538 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 597
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 531 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 590

Qy 598 ACACTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATC 657



Db 831 CGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTA 890

Qy 898 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTT 957
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 891 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTT 950

Qy 958 GATATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTT 1017
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 951 GATATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTT 1010

Qy 1018 CAGAAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTC 1077
        |||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||
Db 1011 CAGAAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTATTCAGTTTC 1070

Qy 1078 TATTTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATG 1137
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1071 TATTTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATG 1130

Qy 1138 TTGAGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAA 1197
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1131 TTGAGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAA 1190

Qy 1198 GTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCAC 1257
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1191 GTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCAC 1250

Qy 1258 CTCAGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTT 1317
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1251 CTCAGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTT 1310

Qy 1318 TTGAGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGC 1377
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1311 TTGAGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGC 1370

Qy 1378 ATTAACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGC 1437
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1371 ATTAACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGC 1430

Qy 1438 TTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGC 1497
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1431 TTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGC 1490

Qy 1498 TTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGC 1557
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1491 TTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGC 1550

Qy 1558 TCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCAT 1617
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1551 TCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCAT 1610

Qy 1618 TAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACAC 1677
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1611 TAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACAC 1670

Qy 1678 TATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGC 1737
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1671 TATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGC 1730

Qy 1738 TGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTT 1797
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1731 TGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTT 1790

Qy 1798 AATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAAC 1857
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1791 AATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAAC 1850

Qy 1858
        |||||||||||||||||||||||
Db 1851



RESULT 15




LOCUS S44866 1872 bp mRNA linear PRI 07-MAY-1993

DEFINITION ETB endothelin receptor [human, mRNA, 1872 nt].

ACCESSION S44866 

VERSION S44866.1 GI:233233 

KEYWORDS 

SOURCE Homo sapiens (human) 

ORGANISM Homo sapiens

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
REFERENCE 1 (bases 1 to 1872) 

AUTHORS Sakamoto,A., Yanagisawa,M., Sakurai,T., Takuwa,Y., Yanagisawa,H.

and Masaki,T. 

TITLE Cloning and functional expression of human cDNA for the ETB 

endothelin receptor 



JOURNAL Biochem. Biophys. Res. Commun. 178 (2), 656-663 (1991)
MEDLINE 91315496 
PUBMED 1713452 

REMARK GenBank staff at the National Library of Medicine created this 
entry [NCBI gibbsq 44866] from the original journal article. 
This sequence comes from Fig. 1. 
FEATURES Location/Qualifiers

source 1..1872

/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
gene 1..1872

/gene="ETB endothelin receptor"
CDS 231..1559

/gene="ETB endothelin receptor"
/note="This sequence comes from Fig. 1"
/codon_start=1

/product="ETB endothelin receptor"
/protein_id="AAB19411.1"
/db_xref="GI:233234"

/translation="MQPPPSLCGPALVALVLACGLSRIWGEERGFPPDRATPLLQTAE
IMTPPTKTLWPKGSNASLARSLAPAEVPKGDRTAGSPPRTISPPPCQGPIEIKETFKY
INTVVSCLVFVLGIIGNSTLLRIIYKNKCMRNGPNILIASLALGDLLHIVIDIPINVY
KLLAEDWPFGAEMCKLVPFIQKASVGITVLSLCALSIDRYRAVASWSRIKGIGVPKWT
AVEIVLIWVVSVVLAVPEAIGFDIITMDYKGSYLRICLLHPVQKTAFMQFYKTAKDWW
LFSFYFCLPLAITAFFYTLMTCEMLRKKSGMQIALNDHLKQRREVAKTVFCLVLVFAL
CWLPLHLSRILKLTLYNQNDPNRCELLSFLLVLDYIGINMASLNSCINPIALYLVSKR
FKNCFKSCLCCWCQSFEEKQSLEEKQSCLKFKANDHGYDNFRSSNKYSSS"

ORIGIN 

Query Match 39.3%; Score 1690.8; DB 9; Length 1872; 

Best Local Similarity 99.6%; Pred. No. 8.2e-293; 

Matches 1695; Conservative 0; Mismatches 7; Indels 0; Gaps 0; 

TGAAACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGC 237 

|| || I I I I I I I I I I I I I I I I I I M I I M I I M M I I I M I I I II I I M I I I 

TGTCTCTAGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGC 230 



QY 


178 


Db 


171 


Qy 


238 


Db 


231 


Qy 


298 


Db 


291 


Qy 


358 


Db 


351 


Qy 


418 


Db 


411 


Qy 


478 


Db 


471 



I I I I I I I I M | I I I I I I I I I I I I I I M I I II I I I I I M II I I I I I M I I I 



290 
357 



CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 

| | | | | | | | | | | || I I I II I I I II I I I I I I I II I II I I I I M M I I I I I I I I M I I I I I I I 
CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 350 



CAAAC C G C AGAGAT AAT GAC G C CAC C C ACT AAGAC CT T AT GGC C CAAGG GTT C CAAC G C C 

| | | | | | I I I I M I II I I I I I I I I I I I I I I I M I II I I M I I I I M I I I I I I I I I I I I I I I 

CAAAC C G CAGAGAT AAT GAC G C CAC C C ACT AAGAC CT T AT GGC C CAAGG GT T C CAAC GC C 



I | | | M I I I M I I I I I I M I M I I I I I I I I M I I I I I I I I I I I M I I I M M I I I I I I I I 

AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 

CCGCCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTC 

| | | | | | | | | | || | || I I I I I I I I I I I I M I I II I I I I I I I I I I I M I I I M I I I I I I I M 
C C G C C AC GC ACC AT CTCCCCTCCCCCGTGC CAAGGAC C CAT C GAGAT C AAGGAGACT T T C 



417 
410 
477 
470 
537 
530 



Qy 538 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 597 

| | I I I I I I I I I 1 I I I I I I I I I I 1 I I I II II I I I I I M I I M I I I M I I I I I I I II I I I I I 
Db 531 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 590 

Qy 598 ACACTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATC 657 

| || I I I I M I I I I I I I I M I I I I I I I I I M I I M I I I I I I I I I I I I I I 

Db 591 ACACTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATC 650

Qy 658 GCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTAC 717 

| I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I M 
Db 651 GCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTAC 710 

Qy 718 AAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATA 777 

| | | I I I I I I I I I I I I I I I I I I I II I I I I I I I I M I I M I I M I I I M I I M I I I I I I I I I 
Db 711 AAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATA 770 

Qy 778 C AGAAAG C C T CC GT GG GAAT C ACT GT GCT GAGT CT AT GT GC T C T GAGT AT T GACAGAT AT 837 

| I I I I I I I I I I I I II I I I I I I I I I M I I I I I I I I II I I I I I I I M II I I i I I I I 

D b 771 CAGAAAGCCT CCGT GGGAAT CACT GT GCT GAGT CTAT GT GCT CT GAG TAT T GACAGAT AT 830 

Q y 838 CGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTA 8 97 

| | | I I I M I II I I I I I II I I I I I M I I I I I I I I I I M I I M I I I I I I I I I M I I I 

Db 831 CGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTA 8 90 

Qy 8 98 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTT 957 

| | | || | I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I M I I I I I I I I I M I I I I II I I 
Db 891 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTT 950 

Qy 958 GAT AT AAT T AC GAT GGACT ACAAAGGAAGT TAT CT GC GAAT CTGCTTGCTT CAT C C C GT T 1017 

I | | | I I I I I I I M I I I I I I I I M M I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I M 
Db 951 GATATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTT 1010 

Qy 1018 C AGAAGACAGCT T T CAT GCAGT T TT ACAAGACAGCAAAAGAT TGGTGGCTGTT CAGT T T C 1077 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I M I II I I I I I I 
Db 1011 CAGAAGACAGCTTT CAT GCAGTT TT ACAAGACAGCAAAAGATT GGT GGCTAT TCAGTTT C 1070 

Qy 1078 TAT TTCTGCTT GC C AT T GGC CAT CACT G CAT t T T T T T AT ACACT AAT GAC CT GT GAAAT G 1137 

| | | | | | | I || I I || I I I I I I I I I I I I I I II I I I I I I I M I I I I I I I I I I I I I I I I I I I II 
Db 1071 TAT TTCTGCTT GC CAT T GG C CAT CACT GC AT T T T T T TAT ACACT AAT GAC CT GT GAAAT G 1130 

Qy 1138 T T GAGAAAGAAAAGT G GC AT G C AGAT T GCT T T AAAT GAT C AC CTAAAGC AGAGAC GG GAA 1197 

t | | || I I I I I I I I I I I I I I I I M I I I I M I I I I I I M I I I I I I I I I I I I I I II I I M I I I 
Db 1131 T T GAGAAAGAAAAGT G GC AT GC AGAT T G CT T T AAAT GAT C AC CTAAAGC AG AGAC GGGAA 1190 

Qy 1198 GTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCAC 1257 

I | | | | || I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I II I I I M I I I I I I I 
Db 1191 GTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCAC 1250 

Qy 1258 CT C AGCAGGAT T CT GAAGC T CACT CT T TAT AAT CAGAAT GAT C C CAAT AGAT GT GAACT T 1317 

| M | I I I I I I I I I I I M I I I I M I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I M 
Db 1251 CTCAGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTT 1310 

Qy 1318 TTGAGCTTTCTGTT GGT ATT GGACT AT ATT GGT AT CAACATGGCTT CACT GAATTCCTGC 1377 

| | | | M I I I I I I I I I I I I M I I I I I I II I I I I I I I I I I I I I I I I M I I I I II I I I I I I I I 
Db 1311 TTGAGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGC 137 0 



Qy 1378 ATTAACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGC 1437
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1371 ATTAACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGC 1430

Qy 1438 TTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGC 1497
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1431 TTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGC 1490

Qy 1498 TTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGC 1557
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1491 TTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGC 1550

Qy 1558 TCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCAT 1617
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1551 TCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCAT 1610

Qy 1618 TAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACAC 1677
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1611 TAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACAC 1670

Qy 1678 TATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGC 1737
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1671 TATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGC 1730

Qy 1738 TGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTT 1797
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1731 TGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTT 1790

Qy 1798 AATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAAC 1857
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1791 AATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAAC 1850

Qy 1858 ACTTAGGCTTAAAAATGAGCTC 1879
        ||||||||||||||||||||||
Db 1851 ACTTAGGCTTAAAAATGAGCTC 1872



Search completed: May 14, 2004, 10:14:31 
Job time : 16263.7 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: May 13, 2004, 23:15:48 ; Search time 1503.76 Seconds 
(without alignments) 
12150.511 Million cell updates/sec 

Title: US-09-931-157-2 
Perfect score: 4301 
Sequence: 1 gagacattccggtgggggac ctgggaaaaaaaaaaaaaaa 4301 

Scoring table: IDENTITY_NUC 
Gapop 10.0 , Gapext 1.0 

Searched: 3373863 seqs, 2124099041 residues 

Total number of hits satisfying chosen parameters: 6747726 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
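
The throughput figure in this header can be checked against the other numbers it reports. The sketch below is an inference from those figures, not something the report states: it assumes one "cell update" is one query residue compared with one database residue, and that both strands are searched (which would also account for the hit count being twice the number of sequences). The function name and the `strands` parameter are illustrative only.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=2):
    """Assumed definition: query_len * db_residues * strands cells, per second, in millions."""
    return query_len * db_residues * strands / seconds / 1e6


# Figures taken from the header above: 4301-residue query,
# 2124099041 database residues, 1503.76 seconds of search time.
rate = million_cell_updates_per_sec(4301, 2124099041, 1503.76)
print(round(rate, 1))  # close to the reported 12150.511
```

Under these assumptions the result agrees with the reported rate to within a small fraction of a percent; the residual difference is consistent with rounding of the reported search time.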

Database : N_Geneseq_29Jan04:* 

1: geneseqn1980s:* 
2: geneseqn1990s:* 
3: geneseqn2000s:* 
4: geneseqn2001as:* 
5: geneseqn2001bs:* 
6: geneseqn2002s:* 
7: geneseqn2003as:* 
8: geneseqn2003bs:* 
9: geneseqn2003cs:* 
10: geneseqn2004s:* 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES 



Result         Query 
No.    Score   Match  Length  DB  ID 

1      4297.8  99.9   4301    2   AAQ34584 


2 


4284. 
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6 
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6 
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7 


ABZ96978 


6      4284.4  99.6   4286    7   ACC72646 
7      4284.4  99.6   4286    7   ABZ42661 



Description 

Aaq34584 ETb recep 
Aaa35162 Human ade 
Aaf21284 Human low 
Abv94186 Breast ca 
Abz96978 Human nuc 
Acc72646 Human end 
Abz42661 Human end 
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2 
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17 
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3 
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3 
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19 
Adb37423 Human can 
Add18443 Human pro 
Aaf21288 Human low 
Abz96982 Human nuc 
Aaa35166 Human ade 
Ach03911 Human cDN 
Abk94410 DNA encod 
Abq77402 Human EDN 
Aav17875 Homo sapi 
Aaa35161 Human ade 
Aaf21283 Human low 
Abz96977 Human nuc 
Ach03912 Human cDN 
Ach03913 Human cDN 
Aaf21285 Human low 
Abz96979 Human nuc 
Aaa35163 Human ade 
Aca56605 Human sig 
Aad24966 Human G-p 
Abi97988 Non-endog 
Abx74409 Human cDN 
Aaa35165 Human ade 
Aaf21287 Human low 
Abz96981 Human nuc 
Adb52872 Primary r 
Abi99321 Mouse isc 
Aaq25892 Sequence 
Aaq53922 Bovine ET 
Abs51841 Novel hum 
Abk94409 DNA encod 
Abl63647 Breast ca 
Abl64653 Stomach c 
Abn95562 Gene #206 
Ach20099 Human adu 
Aaq63209 Human end 
Aaa34793 Human ade 
Aaa34781 Human ade 
Aaf20915 Human end 



ALIGNMENTS 



RESULT 1 
AAQ34584 

ID AAQ34584 standard; DNA; 4301 BP. 
XX 

AC AAQ34584; 
XX 

DT 25-MAR-2003 (revised) 

DT 11-MAY-1993 (first entry) 

XX 

DE ETb receptor gene. 
XX 



KW Human; ETa; ETb; endothelin; receptor; transmembrane domain; N tail; 
KW extracellular; cytoplasmic; C tail; post translational; bovine; 
KW modification; ET-1 receptor; antagonist; circulatory system; ss. 



XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 
FT CDS 238..1566 
FT /*tag= a 
FT sig_peptide 238..315 
FT /*tag= b 
FT mat_peptide 316..1563 
FT /*tag= c 
FT misc_feature 1909..1913 
FT /*tag= i 
FT /function= "Related with mRNA instability" 
FT misc_feature 1997..2001 
FT /*tag= j 
FT /function= "Related with mRNA instability" 
FT misc_feature 2119..2123 
FT /*tag= k 
FT /function= "Related with mRNA instability" 
FT misc_feature 2273..2277 
FT /*tag= l 
FT /function= "Related with mRNA instability" 
FT polyA_signal 2595..2600 
FT /*tag= d 
FT misc_feature 2745..2749 
FT /*tag= m 
FT /function= "Related with mRNA instability" 
FT polyA_signal 3134..3139 
FT /*tag= e 
FT misc_feature 3346..3350 
FT /*tag= n 
FT /function= "Related with mRNA instability" 
FT misc_feature 3484..3488 
FT /*tag= o 
FT /function= "Related with mRNA instability" 
FT misc_feature 3495..3499 
FT /*tag= p 
FT /function= "Related with mRNA instability" 
FT misc_feature 3632..3636 
FT /*tag= q 
FT /function= "Related with mRNA instability" 
FT polyA_signal 3638..3643 
FT /*tag= f 
FT misc_feature 3852..3856 
FT /*tag= r 
FT /function= "Related with mRNA instability" 
FT polyA_signal 4101..4106 
FT /*tag= g 
FT misc_feature 4108..4112 
FT /*tag= s 
FT /function= "Related with mRNA instability" 
FT misc_feature 4213..4217 
FT /*tag= t 
FT /function= "Related with mRNA instability" 
FT polyA_signal 4258..4263 
FT /*tag= h 
XX 
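
A feature table like the one in this entry can be read mechanically. The following is a minimal sketch, assuming each feature is an `FT` line of the form key, start, two dots, end, optionally followed by a `/*tag=` qualifier line; `parse_features` is a hypothetical helper, not part of GenCore or any GeneSeq tooling. The pattern tolerates stray spaces around the dots, which scanned copies often contain.

```python
import re

FEATURE = re.compile(r"FT\s+(\w+)\s+(\d+)\s*\.\s*\.\s*(\d+)")
TAG = re.compile(r"FT\s+/\*tag=\s*(\w+)")


def parse_features(lines):
    """Return one dict per feature line, attaching the tag qualifier that follows it."""
    features = []
    for line in lines:
        m = FEATURE.match(line)
        if m:
            features.append({"key": m.group(1), "start": int(m.group(2)),
                             "end": int(m.group(3)), "tag": None})
            continue
        m = TAG.match(line)
        if m and features:
            features[-1]["tag"] = m.group(1)
    return features


sample = [
    "FT CDS 238..1566",
    "FT /*tag= a",
    "FT sig_peptide 238. .315",
    "FT /*tag= b",
]
feats = parse_features(sample)
cds_length = feats[0]["end"] - feats[0]["start"] + 1
print(cds_length)  # 1329
```

A CDS of 1329 bases corresponds to 442 codons plus a stop codon, which agrees with the 442 amino acids given for ETb in the entry's comment lines.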



PN EP522868-A1. 
XX 

PD 13-JAN-1993. 
XX 

PF 10-JUL-1992; 92EP-00306347. 
XX 

PR 12-JUL-1991; 91JP-00172828. 
XX 

PA (SHIO) SHIONOGI SEIYAKU KK. 
XX 

PI Imura H, Nakao K, Nakanishi S; 
XX 

DR WPI; 1993-010677/02. 

DR P-PSDB; AAR30886. 
XX 

PT Human ETa and ETb endothelin receptors - for measuring endothelin and 

PT screening for endothelin antagonists. 

XX 

PS Claim 12; Fig 2; 39pp; English. 
XX 

CC The sequences given in AAQ34583-84 encode the human ETa and ETb 

CC endothelin receptors respectively. ETa is a 427 amino acid protein with a 

CC molecular weight of 48,726. ETb comprises 442 amino acids and has a 

CC molecular weight of 49,629. ETa has a higher affinity for endothelin (ET) 

CC -1 and ET-2, whereas ETb has no selectivity for ET-1, ET-2 or ET-3. The 

CC receptors each contain seven transmembrane domains and have an 

CC extracellular N tail and a cytoplasmic C tail. There are several 

CC potential sites for post translational modification, these sites are 

CC identical to those of bovine ET-1 receptor. ETa cDNA is 91.2% homologous 

CC to bovine ET-1 receptor cDNA and ETb cDNA is 61.1% homologous to that of 

CC bovine ETa-receptor. The receptor proteins are useful as reagents for 

CC measuring the amount of ET or screening for antagonists of the ET 

CC receptor when studying the circulatory system. (Updated on 25-MAR-2003 to 

CC correct PN field.) 



XX 
SQ 



Sequence 4301 BP; 1342 A; 830 C; 815 G; 1314 T; 0 U; 0 Other; 

Query Match 99.9%; Score 4297.8; DB 2; Length 4301; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 4299; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 
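
The percentages in this result block appear to follow directly from the counts beside them. The sketch below assumes (from the figures, not from any GenCore documentation) that Query Match is the score divided by the perfect score, and that Best Local Similarity is matches divided by the aligned length, each shown to one decimal place. Both function names are illustrative.

```python
def query_match_percent(score, perfect_score):
    """Assumed definition: alignment score as a percentage of the query's perfect score."""
    return 100.0 * score / perfect_score


def local_similarity_percent(matches, mismatches, indels=0):
    """Assumed definition: matching positions as a percentage of all aligned positions."""
    return 100.0 * matches / (matches + mismatches + indels)


qm = f"{query_match_percent(4297.8, 4301):.1f}"
sim = f"{local_similarity_percent(4299, 2):.1f}"
print(qm, sim)  # 99.9 100.0
```

The same ratio reproduces the 99.6 shown in the summary table for a score of 4284.4, and the 100.0% similarity is a rounding of 4299 matches out of 4301 aligned positions.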



Qy 


1 


Db 


1 


Qy 


61 


Db 


61 


Qy 


121 


Db 


121 


Qy 


181 


Db 


181 



GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 

| | | | | I I I M II I II I I I I I I I I M I I I I I I I I M I I I I M I M I 1 I I I I M 

GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 

AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 

| | | | | | | | | | || I I I I I I I II I I I I I I I II I I I I I I I M I I I I I I I I 

AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 



60 
60 
120 
120 
180 



AGGAT CAACACAGT GGCT GAACACT GGGAAGGAACT GGTACTT GGAGT CT GGACAT CT GA 

I | | | | | I I I I I I I I I I I I I I I I I II I I I I M M I I I I M I I I I M I II I I I M I I 

AG GAT CAACACAGT GGCT GAACACT G G GAAGGAACT GGT ACT T GGAGT CT GGACAT C T GA 18 0 



| | | | | | | | | | | I I I II I I I I I I I I I I I I I M I I I I I I I I I I I I I I I 



241 
241 
301 
301 
361 



CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300 

| | | | | | | | | I I II I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300 

TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360 

| | | | | || | | | | | | | | | | I I I I I I I I II I II I I I I I I I I I I M I I I I I I I I I I I I M I I I I 
TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360 



ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420 

| | M I I I I I I I I I I I I I I I I I I I I M I I I M I I II I I I I I I M I I I I I I I I I I I 

361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420 



421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 4 80 

I | | | M || | | | || M I I I I I I I I I I M I I I I I I I I M I I I I I I I I I I I I I I I I I I 

421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 4 80 

481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540 

| | | | | | | | | || | || | I I I I I I I I I II I M I I I I I I I I I I I I I I I I I I 

481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540 

541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600 
    ||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTCCTGGGGATCATCGGGAACTCCACA 600 

601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660 

| | | | | | | | | | | | M | I I I I I I M I I I I I I I II I I I I I I I M I I I I I I I I I I M M II I I I 
601 CTT CT GAGAATT AT CTACAAGAACAAGT GCATGCGAAACGGT CCCAATATCTT GAT CGCC 660 

661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720 

| | M I I I I I M M I I I I I I M I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 72 0 

721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 7 80 

| | | | | | | | | I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I 
721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780 



7 8 1 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 84 0 

| | | | | M | | | | | || I I I I I I I I I I I I I I II I I I I I I I I M II I I M I I I I I M I I I I I I 
7 81 AAAGCCTCCGTGGGAATCACTGTCCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 



841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 

I | | | | M | | | || M I I II I I I I I I I I I M I II I I I I M I I M I I I I I I I I I I I I 

8 41 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 

901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 

| | | || | | | | | | M I I I I I I I I I I II I I I I I I I I I M I I I I I II I I I I I I I 

901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 

961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I M I I I I I I I I I I I I I I I I 

961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 



840 
900 
900 
960 
960 
1020 
1020 



1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080 

| M I II I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I M I I I I I II I I I I I I I I I I I I I 
1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 



1080 



1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140 



1081 TTCTGCTTGC C ATT GGC C AT C AC T G CAT T T TT T TAT AC AC T AAT GAC CT GT GAAAT GT T G 114 0 

1141 AGAAAGAAAAGT GGC AT GC AGAT T GCT T T AAAT GAT C AC CT AAAGC AGAGAC GGGAAGT G 1200 

| | | | | 1 M I I I I I I I I M 1 M I I I 1 I I I I I I I I I I I 1 I I I I M I I I I I I I I I I I I I I I 1 I 
1141 AGAAAGAAAAGT GGCAT GCAGATT GCT TTAAAT GAT CACCTAAAGCAGAGAC GGGAAGT G 1200 

1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 12 60 

| | | | | | | | I I I I I I I I I I I I I I I M I I II I I I I I I I I M I I I I I I I I i I I I I I I I I I I I I 
1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260 

1261 AGCAGGAT T CT GAAG CT C ACT CT T TAT AAT C AGAAT GAT C C CAAT AGAT GT GAACT T T T G 1320 

I M I I I I II I I I I I I M I I I I I I I I I I M M M I II I I I I I I I I I I I I I M I I M I I I I I 

1261 AGCAGGAT T CT GAAG C T CACT CT T TAT AAT C AGAAT GAT C C CAAT AGAT GT GAACT T TT G 1320 

1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 138 0 

| | | | | | | | I I I I I I I I I II I I I I I I I I I I I I I I I I I I M I I I I M I I I I I I II I I I I I I I 
1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 138 0 

1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440 

| | | | | | | | | | | | M I II I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I II I I I I I I II 
1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440 

1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500 

I | | M | I I I I I I I I I I I I I I I I M II I I M I M I I II I M I I I I I I I I I I I I M I II I I I 

1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500 

1501 AAGT T C AAAGCT AAT GAT C AC GGAT AT GAC AACT T C C GT T C C AGT AAT AAAT AC AG C T CA 1560 

| | | | | | | | | || || I I I I I I II I I I I I II I I I I I I I I I M I I I I I I I I I I I M I I I I I I I I 
1501 AAGT T C AAAGCT AAT GAT C AC GGAT AT GAC AACT T C C GT T C C AGT AAT AAAT AC AG CT C A 1560 

1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620 

| M II I I II I I II I I I I I I I M I I I I I I I I I I I I I I I I I I I M I I I M I I I I I I I I I I I I 
1561 T CT T GAAAGAAGAACT AT T CACT GT AT T T C ATT T T CT T TAT AT T GGAC C GAAGT CAT T AA 1620 

1621 AACAAAAT GAAACATTT GC CAAAACAAAACAAAAAACT AT GTATTT GCACAGCACACT AT 1680 

| | | | | | | M | I I II I I I I I I I I M I M I I I II I I I I I I I I I I I I I I I I M I I I I I M I I I 
1621 AACAAAAT GAAACAT T T GC CAAAACAAAACAAAAAACT AT GT AT TT G C ACAGCAC ACT AT 1680 

1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740 

| | M I I I I I I I I M M I I I I I I I I I I II I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I 
1681 TAAAAT AT T AAGT GT AAT TAT T T T AACACT CACAGCT AC AT AT GAC AT TT TAT GAGCT GT 174 0 

1741 TT AC GGC AT G GAAAGAAAAT CAGT GG GAATTAAGAAAG C CT C GT C GT GAAAG CACT T AAT 1800 

| M I I I I I I I I I I I M I I I II I I II I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1741 TT AC G GC AT GGAAAGAAAAT CAGT G G GAAT T AAGAAAG C CT C GT CGT GAAAGCACT T AAT 18 00 

18 01 TT T T T AC AGT T AGC ACT T C AAC AT AG CT CT T AACAACT T C CAGGAT AT T CAC ACAAC ACT 18 60 

| || I I I I I I I I II I M I I I M I I I I I I I I I M I I I I I M II I I I I I I I I I I I I I I I I I I I 
1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860 

1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920 

I I I I I I I I I I I I M I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920 

1921 AAT CAAT GG GACT CT GAT AT AAAGGAAGAAT AAGT CAC T GT AAAAC AGAACT T T T AAAT G 1980 
I | | | | | | | | | | | I I I I I I I I I I II I I I I I I I I I I I I I I M M I I I I I I I I I I M I I M I I 



1921 AAT CAAT G GGACT CT G AT AT AAAG GAAGAAT AAGT CACT GT AAAACAGAAC T TT T AAAT G 198 0 

1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040 
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040 

2041 TAT C AC ACT AT TAT C AGAT T GT AAT T AGAT G CAAAT GAGAGAG C AGT T T AGT T GT T G CAT 2100 

| | | I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
2041 TAT CAC AC TAT TAT C AGAT T GT AAT T AGAT GCAAAT GAGAGAGC AGT T T AGT T GT T GC AT 2100 

2101 T T T T C G GAC ACT GGAAAC AT T T AAAT GAT C AG GAG GGAGTAAC AGAAAGAG CAAGGCT GT 2160 

| | | | 1 M | | | | I | | I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I 
2101 T T T T C GGACAC T G GAAACAT TT AAAT GAT C AG GAGGGAGT AAC AGAAAGAGCAAG GCT GT 2160 

2161 T TT T GAAAAT CAT T AC ACT T T C ACT AGAAGC C CAAAC CT C AG CAT T CT G CAAT AT GT AAC 2220 

| I I I M I I I I I I I I I I I II I I I I I I 1 I I I I I I M I I I I I I I I II M I I I I I I I I I I I I I I 
2161 T T T T GAAAAT C ATT AC AC T T T CACT AGAAGC C CAAAC CT CAGC AT T CT GCAATAT GT AAC 2220 

2221 CAACAT GT CACAAACAAGC AGCAT GTAACAGACT GGCACAT GT GC CAGCT GAATTTAAAA 2280 

I I I I II I I I I I I I I I I I M I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 1 M I 
2221 CAACAT GT C ACAAACAAGCAGCAT GTAACAGACT GGCACAT GT GCCAGCT GAATTTAAAA 2280 

2281 TAT AAT ACT T T T AAAAAGAAAAT TAT T AC AT C CT T T AC AT T C AGT T AAGAT CAAACCT C A 2340 

| | | || I I I I I I I I I I I I M I I I I I M I I I II I I II I I I I I I I M I I I I I I I I I I I I I I I I 
2281 TAT AAT AC T T T T AAAAAGAAAAT TAT T AC AT C C T T T AC ATT C AGT T AAGAT C AAAC CT CA 2340 

2341 CAAAGAGAAAT AGAAT GT T T GAAAGGCT AT C C CAAAAGACT T T T T T GAAT CTGT CAT T C A 2400 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I M I I I II I I I I I I I I I I I I I I I I 
2341 CAAAGAGAAAT AGAAT GT T T GAAAGGCT AT C C CAAAAGACT T T T T T GAAT C T GT CAT T C A 2400 

2401 CAT AC C CT GT GAAGACAAT ACT AT CT AC AAT T T T T T C AGGAT TAT TAAAAT CTTCTTTTT 2460 

M I I I I I I I I II I M I II I I M I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I M I I I 
2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460 

2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 252 0 

I | | | | | | || I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I M I I I M I I I I I I I I I M 
2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520 

2 521 CT GCAT GTAGAT GATTAAAT GAGGGCAGGCCCT GT GCT CATAGCTTT ACGATGGAGAGAT 2580 

I | | || I I I I I I I I I I I I I I M I I I I I I M I I I I I I I I I I I I M I I I I I I I I I I I I 

2521 CT GCAT GTAGAT GAT T AAAT GAGG GCAG GCCCTGTGCT CAT AGCT T T AC GAT G GAGAGAT 2580 

2581 GC CAGT GAC CT CAT AATAAAGAC T GT GAACT G C CT GGT G C AGT GT C CAC AT GACAAAGG G 264 0 

I || | | | | I I I I I I I I I I I I I I I I I I I II I M I I I I I II I I I I I I I I I I I I I I I I II I I I I 
2 581 G C CAGT GAC CT C AT AAT AAAGACT GT GAAC TGCCTGGT GCAGT GT C CACAT GAC AAAGGG 2640 

2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700 

I I | I I I I M I I I I I I I I I I I II I I I I I I I I I I I M I I I I I I I I I I I M I I I I M I I I I I I 
2 641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 27 00 

2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760 

I I | | || I I II I I I I I I I I II I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I M I 
2701 G CT AT AGT TAAAAT ACT AT T TT T CAAAAT C AT ACAGAT T AGT AC AT T TAAC AGCT AC C T G 27 60 

2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820 
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820 



2821 
2821 
2881 
2881 
2941 
2941 
3001 



ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 

1 I I I I I II I II I I II I I I I I I I I I I M I I I > 

ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 

AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 

| | M | | || | | | | M I I II I I I I Nil I M 

AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 



2880 
2880 
2940 
2940 



G GGAT GAGAT GT GT GT GAAAGT AT GT ACAAGAGAAAAC GGAAGAGAGAG GAAAT GAGGT G 3000 

I I I M | I I I I I I I I I I I I I I M I I I I II I I I I I M I I Mill Ill 

GGGAT GAGAT GT GT GT GAAAGT AT GT ACAAGAGAAAACGGAAGAGAGAGGAAAT GAGGT G 



3000 



GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060 

I | M | | | | | | | | I I I I I II II I M I I I II M I II I I M I I M M I I I I I M I I I I I I I I I 

3001 GG GT T GGAG GAAAC C CAT G G GGAC AGAT T C C CAT T CT T AGC CT AAC GT T C GT CAT T GC C T 3060 



3061 C GT C ACAT CAAT GCAAAAGGT CC T GAT T T T GT T C C AGCAAAACAC AGT GCAAT GT T CT C A 

M I M I I I I I I I I I I I M I I I I I I I M I I I I I I I II I II I M I I I I I II 

3061 C GT CACAT CAAT GCAAAAGGT CCT GAT TT T GT T C C AG CAAAAC AC AGT GCAAT GT T CT C A 

3121 GAGT GACT T T C GAAAT AAAT T GGGC C CAAGAGCT T T AACT C GGT CT TAAAAT AT GC CCAA 

MUM I I I I I M M M I I I I I II I I M I I I I I I I I II I I II I 

GAGT GACT T T C GAAAT AAAT T G GG C C CAAGAG CTT T AACT C GGT CT TAAAAT AT GC C CAA 



3121 
3181 
3181 
3241 
3241 



ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 

| | | | || | | | | II M I II M I I I I I I I II I I I I I I I I I I I I I M II I I I I I I 

ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 

TTGTTTTCTGT CAAT AT T GAAT GT GAT G GT AC AGT AAAC CAAAAC C CAACAAT GT GGC C A 
M | | || | | | | | | | I I I I I I II I II II I II I I I I I I I I I M I I I I I I I I I I I I I I I M I I I 
TTGTTTTCTGT CAAT ATT GAAT GT GAT GGT AC AGTAAAC CAAAAC C CAACAAT GT G GC C A 



3301 GAAAGAAAGAGCAATAATAAT TAATT CACACACCATAT GGATT CT ATTT ATAAAT CACCC 

| M I I I I I I I I I I I I I I M II I I I I I M I I I I I I I I I I I I I I I II I M I I I I M 

3301 GAAAGAAAGAGCAATAATAATTAATT CACACACCATAT GGATT CTATTTATAAAT CACC C 

3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 

| | M || | | | | | | | I II I I I I I I I I I I I I I I I I I M Mill 

3361 ACAAACT T GT T CT T T AAT T T CAT C C CAAT CACTT TTT C AGAGGC CT GT TAT C AT AGAAGT 

3421 C AT TT T AGACT CT CAAT T T T AAAT T AAT T TT GAAT CACT AAT AT TTT C AC AGT T T ATT AA 

I | | | M || || I I M I II I II I I I II I I I M II I I M II I I I II I M M II II II I I I M I 

3421 CAT T T T AGACT CT CAAT T T T AAAT T AATT T T GAAT CACT AAT AT TTT C AC AGT T TAT T AA 



3481 
3481 
3541 



TAT AT T T AAT T T CT AT T T AAAT T T T AGAT TAT TTT TAT T AC CAT GT ACT GAAT TTT T AC A 

I I I I | | I I I I I II I II I I I I I M I II II II M I I I I I I I I I I I I M I I I I 

TAT AT T T AAT T T CT AT T T AAAT T T T AGAT T AT TT TT AT T AC CAT GT ACT GAAT TTT T AC A 



TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 

| | I I I I I II M I I I I I I II II I M I II II II I I II II I I I I I MM 

3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 



3120 

3120 

3180 

3180 

3240 

3240 

3300 

3300 

3360 

3360 

3420 

3420 

3480 

3480 

3540 

3540 

3600 

3600 



3601 T GAAACT ACAC ACAAAAAGCAT ACTT GCATTATTTATAATAAAATT GCAT T CAGT GGCTT 3660 

|| | || | | | I II I II I II II 1 I I I I I I I 1 I I I II I M II I II I I I I II I M M M I 

3601 T GAAACT ACAC ACAAAAAG CAT ACT T GCAT TAT T T AT AAT AAAAT T GCAT T CAGT GGCTT 3660 



Qy 3661 T T TAAAAAAAAT GT T T GAT T CAAAACT T T AACAT ACT G AT AAGT AAGAAACAAT T AT AAT 3720 

| | | | | | | | | I I I I I 1 I I 1 I I 1 I I I M II I I I I I I I I M I I I I I I I I I I I I M 1 I I I I M I 
Db 3661 T T TAAAAAAAAT GT T T GAT T CAAAAC T T T AACAT ACT GAT AAGT AAGAAACAAT TAT AAT 3720 

Qy 3721 T T CT T T ACAT ACT CAAAAC CAAGAT AGAAAAAGGT G CT AT C GT T CAACT T C AAAACAT GT 3780 

| | M I I II I I I I M I I I I II I I I I I I I II I I I I I I I I I I I I I I I I I II I I I M I I 

Db 3721 T T CT T T ACAT ACT CAAAAC C AAG AT AGAAAAAG GT G CT AT C GT T CAACT T CAAAAC AT GT 3780 

Qy 3781 T T C C T AGT AT TAAG GACTT T AAT AT AGCAACAGACAAAAT TAT T GT T AACAT GGAT GT T A 3840 

| I | | | I I I I I I I I I I II II II I I M I I I I I I I I I I I I II M I I I I 

Db 3781 T T C C T AGT AT TAAG GACT T T AAT AT AGCAACAGACAAAAT TAT T GT T AACAT G GAT GTT A 3840 

Qy 3841 C AGC T CAAAAGAT T T AT AAAAGAT T T T AAC CTAT TT T CT C C CT T AT TAT C C ACT GCT AAT 3900 

| | | I M M I I I I I I I I I M I I I I I I I I I I I I I I I N I I I I I I I I I I I I I I M I I I I I M I 

Db 3841 CAGCT CAAAAGAT T TAT AAAAGAT T T TAAC CT AT TT T CT C C C T TAT TAT C C ACT GCT AAT 3900 

Qy 3901 GT GGAT GT AT GT T CAAAC AC CT T T T AGT AT T GAT AGCT T AC AT AT GGC CAAAGGAAT AC A 3960 

| || | | | | | | | | | I I I II II M I I I I II I I I I I I I I M I I I I I M I II I I I I I I II I I I I I 
Db 3901 GT GGAT GT AT GT T CAAACAC CT T T T AGT AT T GAT AGC T T AC AT AT GGC CAAAG GAAT AC A 3960 

Qy 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020 

|| I I II I I I I I I I I I I I I M I I I I I I I I I I I I 

Db 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020 

Qy 4 021 AAAAAAT TAT AT AT CT G GGAGGAT TT TT T G GT T G C CT AAAGT GGCT AT AGT TACT GAT T T 4 080 

| | | | | | | | | | I I M I I I I I I I I I I I I I I M I I I I I I I I I II M I I I I I I I I I I I I 

Db 4021 AAAAAAT TAT AT AT CT GG GAG GAT TT T T T G GT T GC CT AAAGT GGCT AT AGT TACT GAT T T 4080 

Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140 

Qy 4141 ACT GT AC AGAC ACT AAT T CAT T AAAT ACTAAT T GAT T GT T T AAAAGAAAT AT AAAT GT GA 4200 

| | | M I M I I I I I M M I I I I I I I M I I I I I I I M I I I I I I I I I I I I I I I M I I I I M I I 
Db 4141 ACT GT ACAGAC ACT AAT T CAT T AAAT ACTAAT T GAT T GTT T AAAAGAAAT AT AAAT GT GA 4200 

Q y 4201 C AAGT GGAC AT T AT TT AT GT T AAAT AT AC AAT TAT CAAGC AAGT AT GAAGT TAT T CAAT T 4260 

| | | | M I I I II I I I I I I I I II I II I I I I I I M I I I I I I I I I I I I M I I I I I I I I M I I I I 
Db 4201 CAAGT GGACAT TAT T TAT GT T AAAT AT ACAATT AT CAAGCAAGT AT GAAGT TAT T CAAT T 42 60 

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGGAAAAAAAAAAAAAAA 4301 

I I I I II I II I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I 
Db 4261 AAAATGCCACATTTCTGGTCTCTGGGAAAAAAAAAAAAAAA 4301 
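
The report header names the "sw model" with a gap-open penalty of 10.0 and a gap-extension penalty of 1.0. As a rough illustration of how a local alignment score of this kind is obtained, here is a minimal Smith-Waterman sketch with affine gaps (the Gotoh recurrences). It is not GenCore's implementation. The match and mismatch values are placeholders, since the report does not list the contents of the IDENTITY_NUC table, and the convention that a gap of length k costs `gap_open + (k - 1) * gap_ext` is likewise an assumption.

```python
def smith_waterman(query, target, match=1.0, mismatch=-1.0,
                   gap_open=10.0, gap_ext=1.0):
    """Best local alignment score with affine gap penalties.

    match/mismatch are placeholder values, not the real IDENTITY_NUC table.
    """
    neg = float("-inf")
    n = len(target)
    h_prev = [0.0] * (n + 1)   # best score of an alignment ending at row i-1
    f_prev = [neg] * (n + 1)   # row i-1 scores ending in a gap that skips query residues
    best = 0.0
    for i in range(1, len(query) + 1):
        h_cur = [0.0] * (n + 1)
        f_cur = [neg] * (n + 1)
        e = neg                # score ending in a gap that skips target residues
        for j in range(1, n + 1):
            e = max(h_cur[j - 1] - gap_open, e - gap_ext)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_ext)
            s = match if query[i - 1] == target[j - 1] else mismatch
            # Local alignment: a cell never drops below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


q = "ACGT" * 6
t = q[:12] + "T" + q[12:]      # same sequence with one extra base inserted
print(smith_waterman(q, q))    # 24.0: every residue matches
print(smith_waterman(q, t))    # 14.0: 24 matches minus one gap opening
```

Only the score is computed here; recovering the Qy/Db alignment shown in the report would additionally require a traceback through the matrices.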



RESULT 2 
AAA35162 

ID AAA35162 standard; DNA; 4286 BP. 
XX 

AC AAA35162; 
XX 

DT 28-JUL-2000 (first entry) 
XX 

DE Human adenosine receptor related polynucleotide 2nd SEQ ID NO: 36. 
XX 

KW Human; adenosine receptor; low adenosine antisense oligonucleotide; 
KW phosphorothioate; impaired respiration; inflammation; allergy; 



KW allergic disease; bronchoconstriction; inhibitor; antiinflammatory; 

KW antiallergic; antiasthmatic; cytostatic; analgesic; impaired airway; 

KW lung disease; ischaemic condition; pulmonary vasoconstriction; asthma; 

KW respiratory distress syndrome; pain; cystic fibrosis; emphysema; 

KW pulmonary hypertension; chronic obstructive pulmonary disease; COPD; 

KW cancer; leukaemia; lymphoma; carcinoma; metastasis; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200009525-A2. 
XX 

PD 24-FEB-2000. 
XX 

PF 03-AUG-1999; 99WO-US017712.
XX 

PR 03-AUG-1998; 98US-0095212P.
XX 

PA (UYEC-) UNIV EAST CAROLINA. 
XX 

PI Nyce JW; 
XX 

DR WPI; 2000-205971/18. 
XX 

PT New antisense oligonucleotides useful for treating e.g. pulmonary 

PT vasoconstriction, inflammation, allergies, asthma, hypertension,

PT bronchitis, emphysema, respiratory distress syndrome, ischemia or 

PT cancers.
XX 

PS Disclosure; Page 1191-1192; 1343pp; English. 
XX 

CC The present invention describes a new composition comprising an antisense 

CC oligonucleotide (ON) with low adenosine (up to 15%), which targets 

CC nucleic acids involved in bronchoconstriction, allergies, and/or 

CC inflammation. The ON can have antiinflammatory, antiallergic, 

CC antiasthmatic, cytostatic and analgesic activities. The compositions are 

CC useful for the treatment of diseases associated with inflammation, 

CC impaired airways, including lung disease and diseases whose secondary 

CC effects afflict the lungs of a subject. They can be used for treating 

CC e.g. ischaemic conditions, pulmonary vasoconstriction, allergies, asthma, 

CC impeded respiration, respiratory distress syndrome, pain, cystic 

CC fibrosis, pulmonary hypertension, emphysema, chronic obstructive 

CC pulmonary disease (COPD), and cancers such as leukaemias, lymphomas, 

CC carcinomas, and cancers which may metastasise to the lungs, including 

CC breast and prostate cancer. The reduction of the adenosine content of the 

CC ONs reduces side effects. The A-containing ONs break down with the 

CC release of deoxyadenosine which activates adenosine receptors causing 

CC bronchoconstriction and inflammation. AAA32313 to AAA35312 represent the 

CC nucleotide sequences given in the sequence listing from the present 

CC invention, which correspond to SEQ ID NO:1 to 2815, and then the last 185

CC sequences are also called SEQ ID NO:1 to 185, but the sequences differ

CC from the previously named sequences. SEQ ID NO: 11 to 1680 (AAA32323 to 

CC AAA33992) are specifically claimed ONs from the present invention. N.B. 

CC Sequences given in the disclosure of the present invention do not match 

CC up with their corresponding SEQ ID NO: sequences given in the sequence 

CC listing 



XX 
SQ Sequence 4286 BP; 1327 A; 829 C; 816 G; 1314 T; 0 U; 0 Other;

Query Match 99.6%; Score 4284.4; DB 3; Length 4286;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps

Qy    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

Qy   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120

Qy  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180



Qy  181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
        |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db  181 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240

Qy  241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300

Qy  301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360

Qy  361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420

Qy  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480

Qy  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540

Qy  541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600

Qy  601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660

Qy  661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720

Qy  721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780



Qy  781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840

Qy  841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900

Qy  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960

Qy  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020



Qy 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080

Qy 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140

Qy 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200

Qy 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260

Qy 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320

Qy 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380

Qy 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440

Qy 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500

Qy 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560

Qy 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620

Qy 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680

1681 TAAAATATTAAGTGTAATTATTTTAACACT^ 1740 

;:: E;^=^^ - 

1,41 TTACGGCATGGAAAGAAAATCASTGGGAATTAAGA^^ 

,801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAG^TATTCACACAAf^CT I860 

;:: i;—^^ »» 

1861 taggcttaaaaatgagctc^ctcagaatttc^ i»20 
1,61 taggcttaaaaatgagctcact^gaatttctm »*> 
1,21 aatcaatgggactctgatataaaggaagaataagtcactgtaaaa^gaacttttaaatg 1980 

1,21 -C^GGAC^;;^ 1980 

1981 aagcttaaattactcaatttaaaattttaaaatcctttaa^caacttttcaattaatat 2040 

2040 

2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100 

:;: 2™^^ 

2101 ttttcggacactggaaacatttaaatgatcagc^^ 2160 

2101 UU^^ - 

2 i 6 l ttttgaaaatcattacactttcactagaagccc^cc™ 2»o 

2161 ii^^AC^CAC^ 222 ° 
2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTG^ 2280 

2221 cA^Gi^C^ ™ 
2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTA^ 2340 

2281 ISiJM^ ™ 
2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAG^CTTTTTTGAATCTGT 2400 

2341 c^GAGAA^^ 

2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTT^ "60 
2401 2460 
2 - Sffl^^ 



246! TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520 

252 1 CTG^TGTAGAIGATTAAATGAGGGCAGGGCCTGTGCTCATAGCTTTAC^ 2580 

25 21 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGT^ »»<> 
2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640 

2581 l^J^^^ 

2641 gcaggtagcaccctctctcacccatgctgtggt— 2,00 

2641 ii^J^^^ ™ 

2701 gctatagttaaaatactatttttcaaw^ 

27 „1 GC^ii^ 

2,61 TAAAGCTTATTACTAATTTTTGTATTATTTTTGT^^ 2820 

27 61 i^^^ «™ 
28 21 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880 

282i ArG^r;;;^^ ™ 

2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940 

^^^^ - 

2941 gggatgagatgtgtgtgaaagtatgtacaagagaa^g^ 3000 

29 41 ^GA^G^A^^ ™ 
3001 =ogttggaggaaacccatggggagagattcccattcttag C ctaacgtt^ 3060 

3001 GGG^GAaIcc^ 3060 
,061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCT^ 3120 

rAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180 

3121 El;;^^^ - 

.181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240 

3il »" 

„ 41 ttgttttcTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300 

;;;; sk^^^ »» 

3301 3360 
330! gIaagIXg^^ " 60 



3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420 



3361 AC 

3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATC^ 3480 

3421 C -;; A ^^ -° 
3481 1= ™ ;;;; 

3481 TATATTTAATTTCTATTTAAATT 3540 
3.41 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600 

3601 TGaIaCTACACACAAAAAGCATACTTGCATTATTTATAATAAA 3660 
,661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720 

3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3760 

372 i ^^i^^ ™ 
3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTCTTAACATGGATGTTA 3840 

378 1 38,0 
3841 GAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCyTATTATCC 3900 

3841 i^J^iiiiM »»» 
3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCA^GGAATACA 3960 

390 1 Mjmm^^ 3960 

3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGT 4020 

3961 GTTTATAGCAAAA^T^GTATGCTGTAGCTAA -0 

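Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080

Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140

Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200

Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
        ||||||||||||||||||||||||||
Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286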

RESULT 3 
AAF21284 

ID AAF21284 standard; DNA; 4286 BP. 
XX 

AC AAF21284; 

XX 

DT 14-MAR-2001 (first entry) 

DE Human low adenosine antisense oligonucleotide related sequence #2851.
KW Low adenosine antisense oligonucleotide; phosphorothioate; [...] allergy;
KW surfactant hypoproduction; pulmonary vasoconstriction; asthma; [...]

KW cancer; ss. 
XX 

OS Homo sapiens. 
XX 

PN WO200062736-A2. 
XX 

PD 26-OCT-2000. 
XX 

PF 24-MAR-2000; 2000WO-US008020.

XX 

PR 06-APR-1999; 99US-0127958P. 

XX 

PA (UYEC-) UNIV EAST CAROLINA. 

PA (NYCE/) NYCE J W. 

XX 

PI Nyce JW; 
XX 

DR WPI; 2000-679539/66.
XX
PT Low adenosine (A) content antisense oligonucleotides wh
PT adenosine receptors during metabolism, useful e.g. for treating cancers
PT and respiratory obstructions.
XX
PS Disclosure; Page 1273-1274; 1592pp; English.

CC The present invention describes low adenosine (A) content antisense
CC oligonucleotides and compositions (I) comprising them. I s 
CC oligonucleotides the A is replaced by a •Universal' <V * ™£«sic 
CC TD can have respiratory, bronchodilator , antiinflammatory, analgesic, 



cc ^suppressive, antiasthmatic ^ot ensive ^^^J^^ 

The antisense oligonucleotides and I an . a „ or i at ed with 



CC 
CC 

cc 
cc 

CC 

cc 
cc 
cc 
cc 



„£7™ nd or = v Ity or target polypeptides associated with 
lung/respir.tory disorders and malignancies, such as stimulating and 

CC I «r CNS and peripheral nervous and non-nervous system peptide 

CC "Emitters, defensins, growth factors, vasoactive peptides and 

surfactant hypoproduction which are associated with a disease or 
condition selected from pulmonary vasoconstriction - a t 



CC 
CC 

cc 
cc 

CC (RDS), pain, cystic fibrosis (CF), allergic rhinitis (AR), pulmonary
CC hypertension, chronic obstructive pulmonary disease (COPD),
CC transplantation rejection, pulmonary infections, bronchitis,
CC and/or cancer. AAF18434 to AAF21543 represent human polynucleotide
CC fragments and antisense oligonucleotides used in the exemplification of
CC the present invention
SQ Sequence 4286 BP; 1327 A; 829 C; 816 G; 1314 T; 0 U; 0 Other;

Query Match 99.6%; Score 4284.4; DB 3; Length 4286;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps

Qy    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

Qy   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120

Qy  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180



Qy  181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
        |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db  181 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240

Qy  241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300

Qy  301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360

Qy  361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420



Qy  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480

Qy  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540

Qy  541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600

660 



601 CT T CT GAGAAT TAT CT ACAAGAACAAGT GCAT GC GAAAC GGT C C C AAT AT CT T G AT CGC C 

I I I I I I I I I I I I i i I | | | | | M I I I I M I I I I I I II I I I M I I M I I I I M I I I II M 
601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGG ^60 

661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATT^ ^20 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 1 1 I 11 1 M 1 ' ' M M ' M ' ' ' „™ 

661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCA 720 

721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780 

I l l I I I I I I I M I I I I I I I I I I I I I I I M I I I I I I I I I I M I M I M I I I II M I I I I I I 
721 CTGCTGGCAGAGGACT 780 
781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTC 840 

I I I l I I I I I I I I I I I I I I M I I I I I M I I I I I I I I I M I I I I I I I M I I I I I I M I I I I I 

781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTA 840 
841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCACT 900 

I I I l I I I I II I M I I I I I I M I I I I I I I M I I I I I I M I I I I I I I I I I M I M M I I I I I 
841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAA^ 900 

901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAG^ 960 

901 i TT GTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960 

961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTC 1020 
i i I I I l I I I I I M I I I I II I I I I I I I I I I I I I I I M I I I I M II M I I I I I I I M I I M I 

961 ATAATTACGATGGACTACAAAGGAAGTTATCTG ™° 



1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCT 1080 

i i I I I i I I I I I I I I I I | I I I I II M I II I I I I I I M M M I M M I II M 

1021 AAGACAGCT T T CAT GCAGT T TT ACAAGACAGC AAAAGAT TGGTGGCT GTT C AGT TT CT AT 1080 

1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140 
1081 TTCTGCTTGC |||||||M||||| |,, | | ,, | | | | | | | | | | | | 

1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACAC 

1141 AGAAAGAAAAGTGGCAT GCAGATT GCT TT AAAT GAT ^^^^^^'^'^^^'? < f ? ? 12 °° 

I I I I I I II I I I I I I I I I I I I I I M I M I I I I I I I I I M I I I I I M II I I I I I I I M I I I I 

1141 AGAAAGAAAA.GTGGCATGCAGATTG 1200 
1 9 01 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260 

?M^MIIMIIMIMIIIIIIMIMIIIIMIIIIIIIIIIIIMMIIIMII 



1201 



GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 12 60 



1261 AG C AGG AT T CT GAAGCT C ACT CT T T AT AAT C AGAAT G AT CC C AAT AGAT GT GAACT TT T G 

Ml I | 1 | I 1 I I I I 1 I I I I I I I 1 I I I I I 1 I I I I I I I I t ! I 1 INN 

1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 

AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 

| | | | | | | | 1 | 1 I I I I I 1 1 1 I I I I I I I 1 I I I I I 1 1 I I I I ■ ■ 1 1 I ■> 1 '11 

AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 

AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 

! I I | | M I I I I I I I I I I I I I I I I I M I I I II I I I M I I I I I I M I I I I I I I I M I I I I M 

AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 



1321 
1321 
1381 
1381 
1441 



TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 

| | | | | | | | | | | | I I I I i M I I I i I I I II I I I I I I I I I 1 M I i ! M I I I 

1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 



1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 

I | I I I I I I I II I M I I I I I I I I I I I I I I I I I I II I I I I I M I I I I I I I M I I I I I 

1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 



1320 
1320 
1380 
1380 
1440 
1440 
1500 
1500 
1560 
1560 
1620 



1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 

HI M I I I I I I I I I I I M I I I I I I I I I M I I II I I I I I I M I I I I I 

TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 162 0 



1561 



1 62 1 AACAAAAT GAAACATTT GCCAAAACAAAACAAAAAACT AT GTATTT GCACAGCACACT AT 
| | | | M I I I I I I I I I I I I I M I I M I I I I I I M I I II I I I I I I I I I I II I I I I M I I I I I 
1 62 1 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 



1680 
1680 
1740 



1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 
M I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740 



1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 

IIIIMIIIIIIIMIIIMMMIIIIIIIIIIIIIIIIIIIMIIMMIIIIIIIII 

1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 



1801 
1801 
1861 
1861 
1921 



TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 
| I | | I | | | | | M I I I I I I I I I I I M I I I I I I I M I I I M I I I I I I I I I I I I I M II I I M 
TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 

TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 

I I I I I I I I | M | M I I I I I I I I I I I I I I I I I I I I I M I II M II I I I I I I I I I 

TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 192 0 



1800 
1800 
1860 
1860 
1920 



AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 

| | | | M I II M I I II I M I I I I I I I I I I M I I I I I I I I I I I I I I II I I I I I 

1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 



1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 

MUM MM MINIMI I II I II I II II I II M II I M M I M I I I I I MM 

1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 

2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 

| I || | | || | M || || II I II II II I I I M I II I I M I I M I M M II I M M I I M II M 
2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 



1980 
1980 
2040 
2040 
2100 
2100 



^ 01 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160 
2101 TTTTCGGACACTG , , , , , , , , , , , , , , , | | | | , , | , | | | | | 

2101 ttttcggacactggaaacat^ 2160 

2161 T T T T G AAAAT CAT T AC AC T T T C ACT AG AAGC C CAAAC CT CAGC AT T CT GCAAT AT GT AAC 2220 
2161 TTTTGAAW^TCATTACACTT^ 2220 
2221 C AAC AT GT CAC AAAC AAG C AGCAT GT AAC AGACT G GC AC AT GT GC C AGCT GAAT TT AAAA 2280 

... I I t I | I I I 1 I I 1 I 1 I t I I I I I I I I I I I I 1 1 1 1 I I I MIIIIMIMIIMI 

2221 C AAC AT GT cJvC AAACAAG C AGCAT GT AAC AGACT G GC ACAT GT G C C AGCT GAAT TT AAAA 2280 



1 AAAAAGAAAATT AT T AC AT CCTT T AC ATT CAGT T AAGAT C AAACCT C A 2340 

2340 



o 0 ft 1 TAT A AT ACTT T T AAAAAGAAAATT AT lAtia o<-i ± j.«.^x x v-™ x ^ - ™ ~ 

2281 TATAATACTTTTAAAAA ,,,,,,,,,,,,,,, , , , | | | , | | | | , , | | 

2281 TATAATACTTTTA^ 



2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCT 2400 
2341 CaIaGAGAA^ 2400 



2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATT 2460 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I M I I I I I I I I I I I 

.CAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460 



2401 CATACCCTGTGAAGAi 

'TT - 

' 1 1T1TTT TT 7T 1 1 1 1 1 1 1 u 1 1 1 m 1 1 1 m 1 1 1 1 m m 



2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACC™ 2520 



2461 

2521 CTGCATGTAGATGA' 



2580 



TTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580 

, I . , ,,, i i i I ! I M | I | 1 | I || I I I I I I I M I 1 I I I I M I I I I I I M 1 I I I I I I M I I t 

2521 CT GCATGTAGAT GATTAAAT GAGGGCAGGCCCT GT GCTCATAGCTTTACGAT GGAGAGAT 
2581 GC C AGT GAC CT C AT AAT AAAGACT GT G AACT GC CT G GT GCAGT GT CCACAT GACAAA.G GG 2640 

I I I I i I I I I I I i I I | | | | | I I I I I I M I I I I 1 1 I I I I M II I I I I I I M I I I I I I I I I I I 

2581 GCCAGTGACCTCATA 2640 
GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700 



2821 
2821 
2881 
2881 



TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820 

ii,i,,i IIM iiiiimiiiiiiiiiiimim iiiiiiiiui i iiini N 

TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAG 2820 
ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 

!milllllllMMMMIIIIIIMIIIIMIIIIIIIIIMIIIIMIMIIIIII 

ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 
AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 

tTTTlTl | I I I I 1 I I I 1 I I I 1 I I I M I M I U I l 1 1 I 1 II I I I I II l I II M I I I Ml I I 

AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 



2880 
2880 
2940 



GGGATGAGATGTGTGTGAAAGTATGTAC^AGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000 

:;: r=^=^^ - 

3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCA^ 3060 

300, gGgUgGAG^^ »» 
3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120 

«;g^^ - 

^191 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA. 3180 

::: hl™^^ - 

.181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240 

Z Hi^^^ »« 

321! TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300 

1 ^G^ii^^ 3300 
3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360 

3 301 »« 
3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420 

33I Efe^^ 3420 

^91 rATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480 

===^ 3,0 

3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540 

«;; c ^^ - 

3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600 

s™^^ *» 

3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTG^ 3660 
3601 ^ACA^i^^^ 

3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3120 

366 1 »» 
3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTC^ACTTCAAAACATGT 3780 

3721 ^l^^^ 37SO 
3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840 



i i i l l l I I I I I I I II I I I I I I I I I I I M M M IN I I I M M M U H 1 M I I M M I I I 

3781 TTCCTAGTATTT^GGACTTTA 

TATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 



3841 CAOCTC^TTTAT^TT, ; ^. ;?;;;;T; ^ T;;T;rn ^ 

Db 3841 



3840 
3900 



GT GGAT GT AT GTT CAAACAC CTTTTAGT ATTGAT AGCTTACAT ATGGC CAAAGGAAT ACA 

M l l I I I I I I I I I I I I I M I I I I I I I M M M I I M I I I I M I I I I I II II I I I I I Ml I 

GTGGATGTATGTTCAAACACCTTTTAGTATTGAT 



Db 3901 
Qy 3961 
Db 3961 

4021 AA7\AAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGOT 4080 

402i jjjjjj,;— 4080 



GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTCT 4020 
I I I I I I I I | I I I I I I I I | I I I I I I I I I I I I I I I I N I I I I I I I I I I I I I I I I I I I I I I I I 
GTTT AT AG C AAAAC AT GGGT AT GCT GT AG CT AACT T T AT AAAAGT GT AAT AT AAC AAT GT 4020 



4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140 
Qy 4081 TTTATTATGTAA ,,,,,,,,,,, ,,,,,, | | | ,, | | | | | | | | | | | | | | I 

4081 TTTATTATGTAAGCAAAA^ 4140 

.141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200 
Qy 4141 ACTGTACAGACACTAA ,,,,,,,,,,,,,,,, , , , , | | | M | | | | | | I I I I II 

AC TGTACAGACA^ *200 



Db 4141 



4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTT 4260 

yy I I I | I I I i i I I i I | | | | | | | I I I I I I I II I I I M M I I M I I I I I I I I I I 

4201 CAAGTGGACATT™ 4260 



QY 



4261 AAAATGCCACATTTCTGGTCTCTGGG 4286 

| | I I II I I I I I I I I I I I I I I I I I I I I 
Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286 



RESULT 4 
ABV94186 

ID ABV94186 standard; cDNA; 4286 BP. 
XX 

AC ABV94186; 
XX 

DT 08-JAN-2003 (first entry) 

DE Breast carcinoma related nucleotide sequence SEQ ID NO: 177.

KW Human; breast carcinoma; cancer; tumour; cytostatic; anti-tumour; gene;

KW ss.
XX 

OS Homo sapiens. 
XX 

PN WO200246467-A2. 
XX 

PD 13-JUN-2002. 
XX 

PF 07-DEC-2001; 2001WO-IB002811.
XX 






PR 08-DEC-2000; 2000US-0254090P.

PR 07-DEC-2001; 2001US-00007926.
XX 

PA (IPSO-) IPSOGEN. 

PI Bertucci F, Houlgatte R, Birnbaum D, Nguyen C, Viens P, Fert V;

XX 

DR WPI; 2002-619023/66. 

PT Novel polynucleotide library useful in molecular characterization of a
PT carcinoma, comprising [...]
PT subsequences which are either underexpressed or [...]
PT cells.
XX 

PS Claim 1; Page 225-226; 401pp; English.

CC The present invention describes a polynucleotide library (I) useful in 

S ^noii^g^^^ 

CC <"» f b^^^rcSSSS :lolyn"i:otSe"Ule fro m 

thP nolvnucleotide sequences of (I) or its expression 
or the poiynucitiULiu ^ H^tectina the reaction product. 

cc ^, P ^re U cyt:^atirS"tief 1^ can'of use^s . ' are' 

cancer treatment, and for detecting differentially expressed genes that 



CC 

cc 
cc 
cc 
cc 
cc 

CC correlated with a cancer 



SQ Sequence 4286 BP; 1327 A; 829 C; 816 G; 1314 T; 0 U; 0 Other; 

Query Match 99.6%; Score 4284.4; DB 6; Length 4286;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps

Qy    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

Qy   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120

Qy  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180

Qy  181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
        |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db  181 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240

Qy  241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300

Qy  301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360

Qy  361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420

Qy  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480

Qy  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540

Qy  541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600



721 



781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 

I I I I I I I I I I II I I I I M I I I I I M I I I I I II I I I I I I I I M I I I I I M I I M I I I I I I I 

781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 
841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 

I | | | | | | M | | | I I I I II I I I I I I M I I I I I I I M I M I I I I I M I I I 

841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 
901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 

MMIIIIIIIIIIIIIIIIIMIMIIIIIIIIIIMIIIIIIIIIIIIIIIMIIIII 

901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 



780 
840 
840 
900 
900 
960 
960 



961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020 



I | | | | | | || | | | | I I I I I I I I M I I I I I I I II I M I I I I I I I I I I I I I I I 

961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 



1020 



1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 

|| | | | | | | I I II I I I I I I I I I I I I I I I I I M I I I I M I I I I ' l I I I I 

1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 



1081 
1081 
1141 
1141 
1201 
1201 
1261 
1261 
1321 



1080 
1080 
1140 



TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 

I I I I | | | | I M I I I I I MINI I I I M I I I I I I I I I I I I I I II I I I I I I I 

TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 
AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 in ii i nun in inn i 

AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200 



1200 



GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 

I | | | | || | | | | | I II III I I I I II I I I I I I I I I I I I I M I I I M I III I I I M II I III I 

GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 
AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG

I || | | | | | | | | | | I || I I II I II I I I I I I II I I I M I I Ml I I M I I I Ml I I I I 

AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 



AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 

I | | | | | | | | | | | || II I I I I I M I I M I I II I III I II III I I Ml I M II I I I 

1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 



1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 

I I I I I I I I | | M I II I II I I I I I I II I II III I III I M I II M III I I 

1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 
1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 

I I I I | M II II I II II I I II I II I II I M I I I I I I I M Ml M M II II I I I M I II I I I 

1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 
1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 

I M m || || M | Ml I I M I I II I I I I I M I I M M II M I I M I I I I I I I I M I II II I 

1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA
1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 

I I I I I I I I I I I M I I I I M I II I I I II I II III I M M II I I M I III I I I M M II I M 

1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 
1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT

I | | | | Ml M II II III I M II I I I M I II M I I I I I I I II II I II II I II I I M III M 

1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT
1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT

I M || | || | | | M I I I I I I I I M II I I I I M I II II I I I I III I I I Ml 

1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 



1260 

1260 

1320 

1320 

1380 

1380 

1440 

1440 

1500 

1500 

1560 

1560 

1620 

1620 

1680 

1680 

1740 

1740 



TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800 



1741 + + , ^ 

I M II II I I I I I II II II M I II M II I MM II II M I M m m M I M M m,"' 

1741 



TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800 



1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860

|| | | || || | || || || | | I II II I II II M M I I I I II M I I M II II I I M II I M M M 
1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860 



1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 

I MUNI I Mill Mil I llllll I I II Illlllllll Mill 

1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 



1920 
1920 
1980 



1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG
M I II II I I I I II I I I I M I M I I I I I M M I I I I II I M M I II I M I M I I II M I II 
1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980 



1981 
1981 
2041 
2041 



AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 

i || || I I M I I Ml II I I I M I I II I I Ml I II I Ml I II I Ml II I I 

AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 
TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 

I | | | | | | | M II II I I M M I I II I I I II I I M Ml I I I I M II II I II M M I 

TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 



2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 

M M II I I II I I II I I Ml I Ml I I I I II I M I M MM I M II M I II I Ml I I MM I 
2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 

2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 

I I M I I II II I I M M M I II I II M II I II M M I I II II I M I II II II I I I M II II 

2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 
2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280

I | | | | | || || || I I M II II I M III I II II II M I M I II II II I I II I I 

2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280



2040 
2040 
2100 
2100 
2160 
2160 
2220 
2220 



2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 

| || | | | || | || M I I I M I II I II I I I M II I II M I II I M M II II II II I II 

2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 

2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 
| | | | I II I II II I II M II I M I II II II II II II II I I I I II M I II II I M I MM II 
2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 

2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 

MIlllMIIIIMIIIIIMIIMIIIIIMIIIIIMMIMMIIIIIMIIIIMI 

2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 

2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 

| | | | | || | | || | | M I II I I I I M II II I M I II II I M II II I M I I II I M II I I I I I 
2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 

2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 

II llllll M Mill II II I II II I I II I I I II II II I M M II INN 

CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 



2521 
2581 
2581 
2641 
2641 
2701 



GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 

| | | | | | | M | | I II I I I II I M I II II I I M I I I M II I I M M M I II M M II I I III 
GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 

GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 

! I I I I I I I I I I I I I I I I I I M I I I II I I M I I I 1 I M I I I I I M MMM 

GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 
GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 



2340 
2340 
2400 
2400 
2460 
2460 
2520 
2520 
2580 
2580 
2640 
2640 
2700 
2700 
2760 



2701 GCTATAGTTAAAATACTATT^ 2760 
2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820 
2761 TAAAGCTTAT AO i i i i | | | | | | | | | | | | | | | | | | | | I I II I I I I I I M I I M I I M I I 

2761 TAAAGCTTATTACTAATTT^ 2820 

2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880
2821 ACATGGTGCTTTTC ,,,,,,,,, , , , , , , | | | | | | I I I I M I I I I I I 

2821 ACATGGTGCTTTTCTTTCATCT^ 2880 

2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940
2881 AGCTTTGTGCGTTCC ,,, ,,,,,,,,,,,,,,,,, | ,,, | | | , | | , | | | 

2881 AGCTTTGTGCGTTCCTGC^ 2940 
2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGG^ 3000 

2941 GGGAT GAGAT GT GT ~ ~ ^^hp^ 

3001 



GTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000



GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCT 3060 
I I I | | | | | I I I I M | | | | | I I M I I M I M I I I I N I I I I I N I N I I II M I M I I I M 
3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGT 3060 

3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTC 3120 
i I I l I I I I I I M I I I I I I I I I I I I I M I I I I I I I M M I I I I I I I I I I I I I I I I I 1 1 1 1 1 , 1on 
3061 CGTCACATCAATGCAAAAGGTCCTGATT 3120 

3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCT^ 3180 

I I I | | I I I I I M | I I | | | | | | | | | | I I II I I I I I M I I I I I I I I I I o-,on 
3121 GAGTGACTTTCGAAATAAM 3180 

3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
3181 ^™ACTTTGTTT m | | n, | | M I I I I I I I I I M M I I I I N I I I I M I I I I I I M 

3181 ATTTTTACTTTGTTTTTCTTT^ 3240 
3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300 

iilllllllllllllllllllllllMIIIIIIIIIIIMIIIMIIIIIIMIIIIIII 

3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
3301 GAAAGAAAGAGC AAT AAT AAT T AAT T C ACACAC CAT AT ^■^j' ^ ?T^TTT^T^^^T ^\^\ ^ ? 

i i i I I l I I M I I I I I I I I M I I I I I I M M I I I I I I I I I N I M I I I I I I I M I I I I I I I 

3301 GAAAGAAAGAGCAATAAT^ 33 ^ 

3420 

TuMlTl I M I iTTl I I I Ti I I INI I I I I I M I I I I I I I I I I I I I I I I ' 



3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 

3361 ACAAACTTGT | | | | | | | | I I I I I I M I I I I I I M I I I I I 

3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTT 3420 



3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480 

I I illllllMIIIIIIIMIIIIIMIMMIIIIIIIIIMIMIMII I MINIM 
3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480 



3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540 

III i I I I I | | M I | | I I I II I N I I M I II II I INN Ml IN IN 

3481 TATATTTAATTTCTATTTAAATTTTAGATT 3540 
3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600 



I I I I I M I I I 



I || | | | | I M I II II I I M I II I I I I I I I N N I I I I 



TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600 



Db 3541 

Qy 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCAT^ 

Db 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 



3660 
3660 
3720 
3720 
3780 
3780 

Qy 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840

W niuiililllMMIIMIMMIIMIIMIIIMMIIIIIMIIIIIII 

Db 3781 



Qy 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT

Db 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 

Qy 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT

UY | I M | | | | | | | M | | | | I I I II I I I I I M I I I I I I M I I I I I I I I I I I I I I I I M I I I I I 

Db 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 



TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA

Qy 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT

27 I | | | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 

Qy 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA

QY iiiMilMIIIIMMIIIIIIIIIIIIIIIIIIIIIIIMMIIMMMIMIIIII 

Db 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 

Qy 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT

QY | | | | | | | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 

Db 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 

Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT

Y IIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIMIIIMIIIIIIIMIIIIII 

Db 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT



3900 
3900 
3960 
3960 
4020 
4020 
4080 
4080 
4140 



Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 

Y | | I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTT 4140

Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200

Y MMIIIIIIIIIIIIIIIIIIMIIIMIMIIIIIIIIIMMIIIIIMIMIIIM 

Db 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200 



CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260 

4260 



4201 CAAGTGGACATTATTTAT^l i/w\J.^.i^«n.x x^j.^™^^ - .w^. ~ 

Y I I M I I I I I I I I I I M I I I I I I I I I I M II I I I I I I I M I I I M I I I I I I M I M 

Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT



Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286

I I M I I I M I I I I I I M I I I 

Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286 



RESULT 5 
ABZ96978 

ID ABZ96978 standard; DNA; 4286 BP.
XX
AC ABZ96978;
XX
DT 17-OCT-2003 (first entry)
XX
DE Human nucleic acid sequence
XX
KW Human; antisense; lung dysfunction; nasal airway dysfunction;
KW antiinflammatory steroid; ubiquinone; antiinflammatory; antiallergic;
KW antiasthmatic; hypotensive; immunosuppressive; cytostatic; gene therapy;
KW antisense gene therapy; respiratory; lung; adenosine sensitivity;
KW adenosine receptor; bronchodilation; bronchoconstriction; lung allergy;
KW lung inflammation; respiratory disease; ds.
XX
OS Homo sapiens.
XX
PN WO200285308-A2.
XX
PD 31-OCT-2002.
XX
PF 23-APR-2002; 2002WO-US013135.
XX
PR 24-APR-2001; 2001US-0286137P.
XX
PA (EPIG-) EPIGENESIS PHARM INC.
XX
PI Nyce JW, Li Y, Sandrasagra A, Katz E, Pabalan J, Aguilar D;
PI Miller S, Tang L, Shahabuddin S;
XX
DR WPI; 2003-229219/22.
XX
PT Pharmaceutical composition for treating ailments associated with impaired
PT respiration, has oligo(s) antisense to specific gene(s) or its
PT corresponding RNAs, and glucocorticoid or non-glucocorticoid steroid or
PT ubiquinone.
XX
PS Disclosure; SEQ ID NO 12220; 872pp; English.
XX
CC The invention relates to a novel pharmaceutical composition, which has a
CC first active agent comprising an oligonucleotide antisense to the
CC initiation codon, coding region, 5' or 3' [...]
CC 5' and 3' intron-exon junctions, or regions within 2-10 nucleotides of
CC junctions of genes encoding a polypeptide associated with lung and/or
CC nasal airway dysfunction and a second active agent comprising an
CC antiinflammatory steroid and ubiquinone. A composition of the invention
CC has antiinflammatory, antiallergic, antiasthmatic, hypotensive,
CC immunosuppressive, and cytostatic activity. The composition may have a
CC use in antisense gene therapy. The composition is useful for treating or
CC preventing a respiratory, lung or malignant disease or condition, also
CC for enhancing the prophylactic or therapeutic [...]
CC antiinflammatory steroid in a subject, for reducing or depleting levels
CC of or reducing sensitivity to adenosine, reducing levels of adenosine
CC receptor, producing bronchodilation, increasing levels of ubiquinone or
CC lung surfactant in a subject's tissue, or treating bronchoconstriction,
CC lung inflammation, lung allergies, or a respiratory disease.
CC Note: The sequence data for this patent is not represented in the printed
CC specification, but was obtained in electronic format directly from WIPO
CC at ftp.wipo.int/pub/published_pct_sequences
XX
SQ Sequence 4286 BP; 1327 A; 829 C; 816 G; 1314 T; 0 U; 0 Other;



Query Match 99.6%; Score 4284.4; DB 7; Length 4286;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps 0;
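
The figures reported for this hit can be sanity-checked with a short script. The sketch below assumes that the "Query Match" percentage is the hit score divided by the "Perfect score" of 4301 given in the report header; the report does not state this formula, but it is consistent with the SUMMARIES rows (for example 1372.2 / 4301 gives 31.9%). The function name and the dictionary are illustrative only and are not part of the search tool.

```python
# Sanity checks on the numbers reported for RESULT 5 (ABZ96978).
# Assumption: Query Match (%) = 100 * score / perfect score.

PERFECT_SCORE = 4301  # "Perfect score" from the report header


def query_match_percent(score, perfect_score=PERFECT_SCORE):
    """Return the score as a percentage of the perfect score, to one decimal."""
    return round(100.0 * score / perfect_score, 1)


# Base composition copied from the SQ line of the record.
composition = {"A": 1327, "C": 829, "G": 816, "T": 1314, "U": 0, "Other": 0}

print(query_match_percent(4284.4))  # compare with the reported Query Match
print(sum(composition.values()))    # compare with the reported length in BP
```

If the assumption holds, the first value should agree with the reported Query Match and the second with the 4286 BP sequence length.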



61 
61 
121 
121 



GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60 

mMMIMIIIIMIIIMIIIIMIIMI Mil I II ''Ml 

GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60 

AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120 

, , I I | | M | | | | | | | | | I I I I I I I I I I M M M I I I I I M I M I M I I I I M M I 

AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120



AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 

| I I | | | | || | | || || | M M I II I I I I I I I I II I I II I M I I I I I M I I I M I I I 

AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 



180 



180 



CTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240



181 
241 



181 AACTTGGCT . Ml 

I I I M II II I I I I I I M I I I I I I I I I M I I II II I I I I M I M I I M I I M I M I I . i i 

AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240 
CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300 

I I I I I I I U I I I I I M M I I I II I M I M I I I M I M I I I M I M I I 

241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300 
301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360 
M I M I I I II I I I I I M I M I I I I I I liiililillilillllilillij.J.UUiiii 

301 --.~~-~~-»~mr.r. 
361 



TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360

ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420 

, , . | | || || | | || | | | | | | | | | || II I I M I I I I I II I M M II I M I II II II M I M I 
361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420 

421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480

540 



481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA

I I I I I I I I I I M M II I I I II I I I M I I M I I II I I M I M I II I M M M^MM 1 1 



540 



481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA

541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600 
I | | | || | M M II I I I M M I II I I I I I I I M M Ml MM I Ml Mm M 



541 
601 



TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600 
CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660 

I | M | | | | || || | | | | | | | | | | | | | | | M II I II I I M I I II I I I M I I II I M M M II 

601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660 
661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720 

I I || M M I I I M M II I I I I I I M I I I I M I M M M I I II M I II II I ' ' ' „. 

661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720 

721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780



781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840

841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900 

I I I I I I I I I I I I | | | | I I I M I I I I I I I M I I I M I II I I I I I Ml 

841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA
901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 

MMMIIIIIIMIIMIIMIIIIIMMMIIIIIIIIII Ml MINI Ml 

901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 

961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 

I | I I I I | | | | | M | I || I I I I I II I I I II I I I I I I I I M I I I I I M M I I M I I I I I M I 
961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 

1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 

,| I M I I I I I I II N I I I I II I I I! I I I W I I I I I I I I I I I I I 1 I I I I I I I I I I 1 I 1IM 

1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT



900 
960 
960 
1020 
1020 
1080 
1080 
1140 



1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 

, , i I I I I I | m I I I I M I I I I I I I I I I I I M I M I I I I I I I I I I II I I II I II I I 

1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140



1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ■■ 1 1 1 ■> 1 1 1 1 1 1 1 1 1 

1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG
1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 

. I i i I I I I I I M I I I I I II I M M M I I I I M I I II M II I I M I I I I M I I I II I II I I 

1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 

1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 
I | | | | | I II I M I I I M I I I M I II I II I I I I I I I I I M I I I I I I I I I I M I I I I I I Ml 
1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 

1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 

i32i ™ millimlllllllll iiiiiiiniiinim 

1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 



1200 
1200 
1260 
1260 
1320 
1320 
1380 
1380 
1440 



AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 

I I I i I I I I I I I I I I | I I M I I M II I M M I II I I M I I I I M II M I M II I M II M I 

AACCCAATTGCTCTGTAT^ 1440 
TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500 

I I | | | | | | | || I II I II M I I M II I M M I I I I M II M I II M II M M II II M II I 
TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500 



1501 



AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 



I I II I I I I I II I M II M I M M I I I M I M M I II II M I II I I II M M M II II II 

1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560 
1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620 

I I I I I I I I I I | | I I M II I M II II II II M II I II M I M M I I II II M I II I MM' 

1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680



1621 JJJJJJJ^^^ 1680 
1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT

I I I I I I I I I I I I I I I I M I I I I I I I M I I I I M I I I I I I I I I M I I I I I I I I I I I I IN I 

1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 



1741 



TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 



1740 
1740 
1800 



i i i I I I I I I II II I I I I I I I I I I I I I I I II I I I I I M I I I I I I I I M I I M I I I I M I I I 
1741 TTACGGCATGGAAAGAAAATC^ 1800 



1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT

i I M I | | | | | | | | | I I M I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I M I I I I I I I M 

1801 TTTTTACACTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 
1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAA^ 

M I I I I I I I I I I I I I I I I I I | | I I I I I I I I I I I I I I I I I I I M I I I M II I I I I I I I I 

1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 
1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAA^ 

, , , || | | | | M | I | | | | || | | | | | | | | I I I I I I I I I I I I I I I I M I I I I I I II I I M I II 

1921 AATCAATGGGACTCTGA^ 

1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAA^ 

I I M I I I I I M I I I I I I I I M I I I I M I I I M I I I I I I I I I I I I I I I I I I I I M I Ml I I 
1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTT 

2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT

I I I I I I I I I I I I I | | | | I I I I I II M I M M I M II II I II M M I II II M I I I I I I M 

2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT



2101 
2101 
2161 
2161 
2221 



TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 

I | | | ■ , , | | , | | | | | | | | | | | M | || I I I M I I I M M I I I M I M M M II I I I I M II 

TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 



1860 

1860 

1920 

1920 

1980 

1980 

2040 

2040 

2100 

2100 

2160 

2160 



TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220 

I i | | | I I I I II I I I II II II I M I M I I I M I I I M I M I M I I I M M M I II M M II 

TTTTGAAAATCATTAC^ 2220 



CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA



2280 



| i | | | | | | | | | | | I I I I I I II I I I II I M I M I I I M I M I I II M I I I I 

2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGG 2280
2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340 

,, | | | I | | || || | | | | I I I I I II I II I I I I I I I M M I I M I I M I II M M I I 

2281 TATAATACTTTTAAAAAGAAAATTAT 2340
2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400 

, I . M i M I I I I I 1 I I I I I M I I I I I 11 1 I I I I I I I i I I I I I I I I I M I M I 1 M M I 
2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAG 2400 

2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460 

I I I M I I I I I I I I I I I I I I I I I I I I I IIMIIMIMMIIMMMMMMII 

2401 CATACCCTGTGAAGAC 2460 
2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520

I I I I I I ! I I | | | 1 I I I M I I I M II I I I I I I 1 I I 1 M I I I I I I M I I 1 M I I I M I I I I 1 



2461 



TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 



2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 
2521 CTGCATG ,,,,,,,,,,,,,,,,,,, ,,,,,, | | | | ,, | | | | | | | || 

2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 

2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 

2581 GCCACTGA n , , , | , , ,| | | M I I I I M II I I I II I M I I I 

GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG



2581 
2641 
2641 
2701 
2701 
2761 



GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 

I M I I I I | | | | I i I I I II I I I I I I I M I I I I I I I I I I I I II I I I I I I 1 I 

GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 
GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 

I I | | | | | | M I I I I I I I M I I I I I I I I I II I I I I I MINIMI": 

GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 



2520 
2580 
2580 
2640 
2640 
2700 
2700 
2760 
2760 
2820 



TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 

M I I I I M I I I I I I I I M M I I M I I I I I I M I M I M I II I I I I M I I I I I N I I I I I I 

2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820 



2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 

I I I I 1 i I I t I 1 I I I I I II I II I I I I M II I I I M I I I I I I II II I I I II M II I I 

2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 
2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT

MIIIIMIMIIIIMIIIIIIIIIIIMimMI ! ! I I I I I I ! M I I I I 

2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 
2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG

IMMIIMIIIIIMIIIIIIMMIIIMIIIIIMMIIIIIIIIMIMMIIIII 

2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG



3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 

I I I IN I | | M I I I I I I I I I I I I I MMI II I MM I I I Ml I II MMMI IIIIIIH 
3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060 

3120 



2880 
2880 
2940 
2940 
3000 
3000 
3060 



3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 
i i I I i i i i M I I I I I I I I M I I I I I II II I I II II I I I I I M I M I I I I 

3120 



I I I I I I M I M I I M M I I I I II M I I I I M I I I I I M I I I I I I I I I I I I I N I M I I I 

3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 



GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180 

I I I I | M M I I I I II M I I I I M II I I I I M I I I I I M M I I M I M I I I I M M I II I I 

GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180 



3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300 

M I M I II II I I I I II I I I I I M M I I I I I M I I I I II II I I M M I I I I I M II I I I M 

3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300 
3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360 

M I I I II II I I II M M II I I II I I I I I I M II II I I M I I I II M I I I I I II M M I M 

3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360 



3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420 

I i I I I I I I M I I I I I I | | I I I I I I I I I M I I M I I I I I I I M I I I I I I I I I I I I I I M M 

3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420 
3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480

IMMI Mill MM I I II I I II III I'll MM II 

3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 
3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 

in 1 1 1 1 1 1 1 1 ii 1 1 1 ii 1 1 1 ii 1 1 1 ii 1 1 ii 1 1 1 mil mill 1 1 1 1 1 1 

3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 
3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 

I I I | | I | | | | I I I I I I I II I I II I II M I I I M M I I M I M I Ml Mill MM 

3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 
3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 

I I I I I I I I I I I I I I | | | I I I I I I I II M I II I I II II I M M I I I II II II I I I I 

3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT



3480 
3540 
3540 
3600 
3600 
3660 
3660 
3720 



3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT
I I MINIMI N M I II II N II I I I I N N I I I II I N I I I M M M II I I I I 

3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720 



3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 
| | I | | | | | | | | | | I I I I I I I I I I II II I I I I I II II I I I I II II I I I I II II II II I II I 
3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 

TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA

I I I I I | | | | | | | | I I I I I I I I I I I M II I N II I I N I I N I M II II II I M M 

TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 



3781 
3781 



3780 
3780 
3840 
3840 



^TCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900 

II II I I I I II I I II II II II I I I M I Ml II I I l_M NN MM I IMMI IIIIINI ^ 



3841 CAGC 

3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT



3960 



3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 

I , I , . , I I II I I I I | | | | I I I M I I I I I I I I I I M M I I M I I I I I I I I M I I I I M I M 

3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960 



3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT

M I I I I I II N I I II | N N I I I II I I I I I I I N II II I I II II I II I INI 

3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 

4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 

I , M I I | I M I I I I I I I I M I I I I I I I I I I I I I I I I I I 1 I I 1 I I I I I I I I I I I I I 

4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 



4020 
4020 
4080 
4080 
4140 



4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 

I I I I I I I I I I I I I I I I I I I I N I I I I II II II I N II I I I I I I I II I N I N I II 

4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTT 4140
4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200

I I I I I I I | N | | II II II I I N I II II N I N II MMI Mil II I 

4141 ACT GT AC AGACACT AAT T CAT T AAAT ACT AAT T GAT T GT TT AAAAGAAAT AT AAAT GT GA 4200 



Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260 

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286 
        |||||||||||||||||||||||||| 
Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286 



RESULT 6 
ACC72646 

ID ACC72646 standard; cDNA; 4286 BP. 
XX 
AC ACC72646; 
XX 
DT 09-JUL-2003 (first entry) 
XX 
DE Human endothelin receptor type B encoding cDNA. 
XX 
KW Human; cancer; diagnosis; screening; modulator; leukaemia; ischaemia; 
KW heart disease; atherosclerosis; endometriosis; gene; ss. 
XX 
OS Homo sapiens. 
XX 
PN WO2003025138-A2. 
XX 
PD 27-MAR-2003. 
XX 
PF 17-SEP-2002; 2002WO-US029560. 
XX 
PR 17-SEP-2001; 2001US-0323469P. 
PR 20-SEP-2001; 2001US-0323887P. 
PR 13-NOV-2001; 2001US-0350666P. 
PR 08-FEB-2002; 2002US-0355145P. 
PR 08-FEB-2002; 2002US-0355257P. 
PR 12-APR-2002; 2002US-0372246P. 
XX 
PA (EOSB-) EOS BIOTECHNOLOGY INC. 
XX 
PI Afar D, Aziz N, Gish KC, Hevezi PA, Mack DH, Wilson KE; 
PI Zlotnik A; 
XX 
DR WPI; 2003-354600/33. 
DR P-PSDB; ABR58526. 
XX 
PT New genes that are up-regulated or down-regulated in cancers useful as 
PT markers for diagnosing e.g. cancer, ischemia or heart disease, or as 
PT therapeutic targets for screening drugs for treating these diseases. 
XX 
PS Claim 8; Page 143-144; 767pp; English. 
XX 
CC The present invention describes an isolated nucleic acid molecule, which 
CC comprises the sequence of any of the genes that are up-regulated or down- 
CC regulated in specific cancers (e.g. about 1031 genes up-regulated in 
CC acute lymphocytic leukemia). ACC72641 to ACC72860 represent cancer 
CC related gene nucleotide sequences which encode the proteins given in 
CC ABR58521 to ABR58709. Also described: (1) determining the presence or 



r »,n™lnr.i™l cell in a patient; (2) an expression vector 
CC absence of a P"hological cell in P above . (3) a host « u 

CC comprising *" ucle "> C ^. ted polypeptide, which is encoded by 

CC comprising an 4 i n ™ b ^° ^t specifically binds the polypeptide 

CC of Z (6 Self «lS targeting . corned to a pathological cell in a 

CC patient by administering to the patient the antibody J^'^J^ „ 

CC pathologies 

SQ Sequence 4286 BP; 1327 A; 829 C; 816 G; 1314 T; 0 U; 0 Other; 
Query Match 99.6%; Score 4284.4; DB 7; Length 4286; 
Best Local Similarity 100.0%; Pred. No. 0; 
Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 

61 AGGTAGGCATTTGCC^ 120 

121 AGGAT CAACACAGT GGCT GAACACTGGGAAGGAACT GGTACTTGGAGT CTGGAC AT CT GA 180 
Qy 121 AGGAT CAACACAGT GGO , , , , , , , , , , , , , , , | , , , , | M I I I 

121 A.GGAT CAACACAGT GGCT G AACACT G G GAAGG AACT GGT ACTT GGAGT CT GGAC AT CT GA 180 
181 ^CTTGGCTCTgIa^ 240 



60 
60 



U CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 



Db 

300 

241 cIgccgcctccIagtctgtgcggIcgcgccctggttgcgctggtt 300 

soi tcgcggatctggggagaggagagaggcttcccgcctgacagggccactccgcttttgcaa 360 
Qy 301 tcgcggatctgggga i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i I I I I I I I I I 

Db 301 tcgcggItctggggIgaggagagaggcttcccgcctgacagggccactccgcttttgcaa 360 

Qy 361 accgcagagataatgacgccacccactaagaccttatg^^ 420 

361 accgcagagataatgacgccacccactaagaccttatgg 420 

Qy 421 ctggcgcggtcgttggcacctgcggaggtgcct^ggagacaggacggca^ 480 
gy , I I i i I I I I I II i i I I I I I M I I I I I I I I I I M I I I I I I M I I I I I 1 I 

Db 421 CTGGCGCGGTCGTTGG^ 480 



Qv 481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540 

Y l I I I I I I I I I I I I I I I I I I I I I M I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml 

481 CcicGcicciTCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540 



541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600 
541 TACATCAACACGGT ,,,,,,, ,,,,,, | , , , , | | | | | | | | I II I I 

541 TACATCAACACGGTTGTGTCCTGC 

601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGA^ 660 
i ii i i i I l l l I I I I I I M I II I I I I M I I M M M I I I I M I I I I M M I M M I 

601 CTTCTGAGAATTATCTA^ 660 
661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCT 720 

i i i i i i I l l l I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I M I I I I I I I I I I I M I I 

661 AGCTTGGCTCTGGGAGACCT^ 720 
721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATA^ 780 

l I I I I . I . i I M I I I M I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I II I I I I I Ml 

721 CTGCTGGCAGAGGACTGGCCA^ 
781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATOT B40 

i i i I l M I I I I I I I I I I M I I I I I M I I I I M I I I I N I I 1 1 11 1 1 1 1 1 11 11 11 " 111 OAn 

781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGA 840 
841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCA 900 

i i i i i i i I l l I I I I I II I I I I I M I I I I I I I I I I I I I I I I 1 M I II I I I I I I I I I I I 

841 GCTGTTGCTTCTTGGAGT 900 

960 

901 iUGiiiiGliiiGGGiGGicicTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960 



780 



ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 



ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATC^ 1020 
I | I I | | M I I I I I I I I I M I I I I I I I I M I M I I I I I I I I I I II 
961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 



i n 1 1 1 n 1 1 M i m i.]iiii::i::;:^;;;;;;^; T ^ TTr . ATC ccGTTCAG 1020 

1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTG 1080 

mi i i I I I I I I I | I | | | | M I I I M I I M I I M I I I I I I I II I M I I I I I I I M 

1021 AAGACAGCTTTCATGCAGTTTTA 1080 
1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140 
1081 TTCTGCTTGCCA ,,,,,,,,,,,,,,,,,,,,,, | | | | | | | | | | | | | | I I I 

1141 AGAAAGAAAAGT GGC AT GC AGAT T GCT T T AAAT GAT C AC ^^^^^^^^ < ????^^?T *T 12 °° 

Vn i i i i i i I l l l I I I I M I I I M I I I I I I M I I I I I I I I I I I I M I I I I M I I I I I I I I 

1141 AGAAAGAAAAGTGGCATGCAGATTG^ 1200 

1 901 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260 
1201 SCCAAAACCGTCTT ,,,,,,,,,,,,,,,, | | | ,,, | | | | | | | | | | | | 

120! GCCAAAACCGTCTTTTGCCT 1260 
1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACT 1320 

Vn i i i I I 1 I II I I M I I I I I I I M I I I I I I I I M M I I I I M M I I I I I I I I I I I I I I I 
1261 AGCAGGATTCTGAAGCTCACTCTTTAT 1320 

1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380 
1321 AGCTTTCTbl , , , , , , , , , , , , , , | , , , , , | || , | | | | | | | II I 

1321 AGCTTTCTGTTGGTATTGGACTATAT^ 1380 

1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440 



1381 1440 

1441 UiUiiiiiGC^icAiTiGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCT^ 1500 

Tui T i???m iTTiTTiiTTTTTiVi TTiTm i i 1 1 1 1 1 1 1 1 mi mi mi i 1 1 1 1 ^ 



1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATA^ 1560 
1501 AAGTTCAAAGCTAATGATCACGGATA "60 



1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTC 1620 

l I I I M I I I II I M I M M I M I M I I I I M I I M M II I I M M I I I I M M I M M I I 

1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATAT 1620 

CAAAAC AAAAC AAAAAACT AT GT AT T T GCAC AGCACACT AT 168 0 



1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740 
1681 TAAAATATTAA i ^^^^^ " I' !i!'iililUiliiiiUJJ.iU™UiiiCT 1740 
1681 



1741 



1741 



TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTT 
TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCA^ 1800 



1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860 

;::: e-^^ «« 

1861 ™a_ ? ™^^ 21 

TAGGCTTW\AAATGAGCTCACTCAGAATTTCTATTCTTTCTA^ 1920 



1861 



1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTT^ ™° 



1921 



1981 



AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 



2040 

19<1 sniiiiiS *>« 

?041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100 

e^^^ »« 

2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160 
2101 ^CGGACAC^^ 



2220 



2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGT 2220 

i ii i i i i i I M I I I I I I I I I M II I I I I M I I M I I M M M II I I I M II I I M II I I I 

2161 TTTTGAAAATCATTACACTTTCACTA 
2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280 

2221 ^™ 1|l|llllllllll ||||| 1 i,iiiiiiiiiiiiiiiniiiiiiiiiiii 



2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280 



2281 TATAATACTTTT. 



1 AAAAAGAAAAT TAT T AC AT C CTTT AC AT T C AGTT AAGAT CAAAC CT CA 2340 

M I i I I I I I | I M | | | | I I I I I M I M II I I I I I I I I I I I I I I I I N M I I I 

2281 TATAATACTTTTAAAAAGAAAATTATTACA 2340 
2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400 

I III I | | | | | | I I 1 I I I I I 1 I I I I I I I I I 1 I 1 I I I ■ * 1 1 1 1 1 1 1 1 ' ' 0J nn 

2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400 

9401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460 
2401 CATACCCTGbAA ,,,,,,, ,,,,,,, | ! | | | | | | | | | | 

2401 CATACCCTGTGAAGA^ 2460 

9 4 61 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520 
2461 TCACTATCGTAGO , , , , , , , | , , , , | , , , | | | | | | 



2461 



| | | | | | | | | | | I I I I I I I I I I I I I I I I M II M I I I I I I I I M I I M i i 

TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTA 2520 



2580 



2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 

Ml, iiiiiiMIIIIIIIIIMMIIIMIIMIIIIIIIIIIMIII 

2521 CTGCATGTAGATGAT^ 2580 

2640 

2640 

2700 

2700 

2760 



2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 

I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I 

2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 
2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATA 

MIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIMIIIIIIIIIIIMIIIIIMM 

2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 



2701 



2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACA^ 

i ii i i i i i i i I I l I I I I I I I I I I M I I I I I I I I I I I I I M I I I I I I I I M I I I 

GCTATAGTTAAAATACTATTTTTC 2760 

2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGT^ 

I I I I I I I i I I I | | I I I I M II I I M I I I I I I I I I I I I II II I I I I I I I I M I I I I I I M I 
2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 

?821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 

2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 

2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 
I I I I I I I I I I I I I I I M I I I I I I M I I I M I M I I I I I I II I I I I I M I I I I I I I I I I I I 
2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 



2820 
2820 
2880 
2880 
2940 
2940 
3000 



GGGAT GAG AT GT GT GT GAAAGT AT GT AC AAGAGAAAAC GGAAGAGAG AGGAAAT GAGGT G 



2941 G G GAT GAG AT GT GT GT GAAAGT AT GT AC AAGAGAAAAC G GAAG AG AG AG GAAAT GAGGT G 

| | | | | | | M II I I I I I I I I I I I M liiiiii!ilIii^iii^^^^iii T iiGGTG 3000 
2941 " """">"»'>"^-"^- 

3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060 

i I I M I I I I I I I M I I I M I I I I I I M I I I I M I I I M I M I I I M I 

300! GGGTTGGAGGAAACCCATGGGGACAGAT^ 3060 
3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120 

i l l l I I I M I I I I M I I I II I I I I I I I I I M I I I I I M II I I I I II I I I I I I M I I I I I I 

3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTT 3120 



3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 
3181 



3180 



ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240 

IMM MINIM MINIMI MINIMI MM MINI I I I I I I Ml II I I ^ 



3241 



3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180 

nmmnnn 1 1 1 1 1 1 1 1 1 1 1 1 1 > 1 1 1 1 1 1 LIlLilLLLIIIIIIIIIIII 1 

;ag 

3181 aUtttactttgtttttcttttaataggctgggccacatgttgga^ 

TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300 
IMM MINIMI MNMNMINNNMMNNMIMIMNMIMNIN I 
3241 TTGTTTTCTGTCAATATTGAA^ 3300 

3301 GAAAGAAAGAGCAAT AAT AATT AAT T CACAC AC CAT AT GGAT T CT ATTT ATAAAT CAC C C 3360 

I I I | in Ml I I MINN I I I MM M M I I M I I I I II I N 

3301 GAAAGA^GAGCAAT AAT AATT AAT T CACAC AC CAT AT GGAT T CT ATTT AT AAAT CAC C C 3360 

3 361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420 

3361 ACAAACTTGTTC ,| I II I N I M II II II I M N I I I I N I I I I 

3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420 

3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480 



| m I M M | | | || || | | | II I M II M I I I I I II I I I M N M I I I N N I ... ■ 

3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 

3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 
III | | | | M |l I II M I I I I I I M I II I II I I I I I I N I N I I I I I I I I M I I I I I Ml' 

3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 
3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 

I , | | , , , , | , | | | | | m I I II I M M II I I I I M II I II I I N I II I I I N N II I IN 

3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 
3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTC 

II II N N I N N I I N II II II I N II II I II I N I I I N I II I I I M M N II I N II 

3601 T GAAACT AC AC AC AAAAAGCAT ACT T GCAT TAT T T AT AAT AAAATT GC AT T C AGT GG CT T 
3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 

I | | | | M I,, | | | | M | | | | | N II II I II I I M II I I I N I N I I I N N N N INN 

3661 tTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTG 

3 7 ? 1 T T CT T T ACAT ACT C AAAACCAAG AT AGAAAAAGGT GCT AT C GTT CAACT T C AAAAC AT GT 
3721 TTCTTTACATACT ,,,,,,,,,,,,,,, | | | , | || I I I N N N I N N N 

3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 



3480 
3540 
3540 
3600 
3600 
3660 
3660 
3720 
3720 
3780 
3780 
3840 



3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATCTTA 

3840 



I , | , M | , | t , II II N II I I N II II II I I N I II I II II II I I I I N N I II I 

3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 
3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900 



I I I 1 1 I I I I I I 



I || M | I II I N I M I I M I I I I I I • ' 1 I I I I I 



3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 
3901 



3900 



3901 



GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960 

,m | | | | | | || | || | N I N I N I II I I I II I N I I II I II II I N II I I I N N 

GT G GAT GT AT GTT C AAAC AC CT T T T AGT AT T G AT AGCT T AC AT AT GG C C AAAGGAAT AC A 3960 



4020 



Qy 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTAT "20 

Db 3961 GTTTATAGCAAAACATGGGTATGC^ 

Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTT^ 4080 

Db 402 mm™mn,m^r.n,n./=r.r.maaar: 



1 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 



Oy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTT^ 4140 

QY | | , I I I | | | | || M | | | M I I I I I I M I I 1 I I I I I I I I M I I I I I I I I I I I I I I I 

Qy 4141 ACT GT ACAGACACTAATT CATT AAATACTAATT GATT GTTTAAAAGAAAT GT GA. 

vy ......it i i i ii i I I i I I I I M I I I I I I I I I I I M I M M I I I I I I I I I 



Db 408] 

4200 

VTT I mTmY I I I I M I I I M II I M M I I I I I I I I I I I I I I I N I M I I M I I I 1 I I l l 
4141 iciiii^ 4200 



Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260 

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286 
        |||||||||||||||||||||||||| 
Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286 



RESULT 7 
ABZ42661 

ID ABZ42661 standard; DNA; 4286 BP. 
XX 
AC ABZ42661; 
XX 
DT 04-MAR-2003 (first entry) 
XX 
DE Human endothelin B receptor nucleotide SEQ ID NO: 113. 
XX 
KW G protein-coupled receptor; GPCR; antigenic peptide; gene therapy; 
KW G protein-coupled receptor modulator; antibody; 
KW growth-related disease; cell regeneration-related disease; AIDS; cancer; 
KW immunological-related cell proliferative disease; 
KW Alzheimer's disease; atherosclerosis; infection; [illegible] 
KW osteoporosis; cardiomyopathy; inflammation; Crohn's disease; [illegible] 
KW graft versus host disease; Parkinson's disease; multiple sclerosis; pain; 
KW psoriasis; anxiety; depression; schizophrenia; dementia; 
KW mental retardation; epilepsy; asthma; tuberculosis; obesity; nausea; 
KW hypertension; hypotension; renal disorder; rheumatoid arthritis; trauma; 
KW ulcer; gene; ds. 
XX 

OS Homo sapiens. 
XX 

PN WO200261087-A2. 
XX 

PD 08-AUG-2002. 
XX 

PF 19-DEC-2001; 2001WO-US050107. 
XX 

PR 19-DEC-2000; 2000US-0257144P. 
XX 



PA (LIFE-) LIFESPAN BIOSCIENCES INC. 
XX 

PI Burmer GC, Roush CL, Brown JP; 
XX 

DR WPI; 2003-046718/04. 

DR P-PSDB; ABP81815. 
XX 

PT New isolated antigenic peptides e.g., for G protein-coupled receptors 

PT (GPCR), useful for diagnosing and designing drugs for treating conditions 

PT in which GPCRs are involved, e.g. AIDS, Alzheimer's disease, cancer or 

PT autoimmune diseases. 

XX 

PS Disclosure; Fig 1; 523pp; English. 
XX 

CC The present invention describes antigenic peptides (I) comprising: (a) 

CC any one of 1601 sequences (see ABP82019 to ABP83619) of 12-24 amino 

CC acids. Also described: (1) an assay for the detection of a particular G 

CC protein-coupled receptor (GPCR) or a candidate polypeptide in a sample; 

CC and (2) an isolated antibody having high specificity and high affinity or 

CC avidity for a particular GPCR. (I) can be used as GPCR modulators and in 

CC gene therapy. The antigenic peptides for GPCRs are useful in detecting an 

CC antibody against a particular GPCR, and in the production of specific 

CC antibodies. The peptides and antibodies are also useful for detecting the 

CC presence or absence of corresponding GPCRs. The antigenic peptides for 

CC GPCRs and antibodies are useful for diagnosing and designing drugs for 

CC treating immune-related diseases, growth-related diseases, cell 

CC regeneration-related disease, immunological-related cell proliferative 

CC diseases, or autoimmune diseases, e.g. AIDS, Alzheimer's disease, 

CC atherosclerosis, bacterial, fungal, protozoan or viral infections, 

CC osteoarthritis, osteoporosis, cancer, cardiomyopathy, chronic and acute 

CC inflammation, allergies, Crohn's disease, diabetes, graft versus host 

CC disease, Parkinson's disease, multiple sclerosis, pain, psoriasis, 

CC anxiety, depression, schizophrenia, dementia, mental retardation, memory 

CC loss, epilepsy, asthma, tuberculosis, obesity, nausea, hypertension, 

CC hypotension, renal disorders, rheumatoid arthritis, trauma, ulcers, or 

CC any other disorder in which GPCRs are involved. The antibodies may be 

CC used in immunoassays and immunodiagnosis. ABZ42523 to ABZ42869 encode 

CC GPCR proteins given in ABP81675 to ABP82018, which are used in the 

CC exemplification of the present invention 

XX 

SQ Sequence 4286 BP; 1327 A; 829 C; 816 G; 1314 T; 0 U; 0 Other; 



Query Match 99.6%; Score 4284.4; DB 7; Length 4286; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 
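
The summary figures above can be cross-checked with a few lines of arithmetic. The sketch below assumes that "Query Match" is the hit score expressed as a percentage of the perfect score (4301 in the report header); that reading is inferred from the numbers in this report rather than taken from the search tool's documentation. The function names are hypothetical, chosen only for illustration.

```python
# Sanity checks on the RESULT 7 summary lines.
# Assumption: Query Match % = 100 * Score / Perfect score (inferred from the report).

def query_match_percent(score: float, perfect_score: float) -> float:
    """Return the score as a percentage of the perfect score, to one decimal."""
    return round(100.0 * score / perfect_score, 1)


def composition_total(counts: dict) -> int:
    """Sum the per-base counts from an SQ line."""
    return sum(counts.values())


PERFECT_SCORE = 4301  # "Perfect score" from the report header

# Counts copied from the SQ line of the record.
sq_counts = {"A": 1327, "C": 829, "G": 816, "T": 1314, "U": 0, "Other": 0}

match_pct = query_match_percent(4284.4, PERFECT_SCORE)
total_bases = composition_total(sq_counts)
aligned_columns = 4285 + 1  # Matches + Mismatches; Indels and Gaps are both 0

print(match_pct, total_bases, aligned_columns)
```

Under that assumption the computed percentage agrees with the reported 99.6%, and both the base composition and the aligned column count agree with the stated length of 4286.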

Qy    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60 

Qy   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120 

Qy  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180 

Qy  181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240 
        |||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||| 
Db  181 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240 

Qy  241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300 

Qy  301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360 

Qy  361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420 

Qy  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480 

Qy  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540 

Qy  541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600 

Qy  601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660 

Qy  661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720 

Qy  721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780 

Qy  781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840 

Qy  841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900 

Qy  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960 

Qy  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020 
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020 



1021 
1021 
1081 



AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 

I I I I I I | I | | | I M I I I I I I I I I I I I I I I I I M I I I I I I I M I I I I I I I I I I I M 

AAGACAGCTTTCATGCAGTTTTAC7^AGACAGC7KAAAGATTGGTGGCTGTTCAGTTTCTAT 



1080 
1080 
1140 



TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 

| | | | M I I I II M I I I II I I I I M I I I I I I I I M I I I I 

1081 TTCTGCTT GC C AT T G G C CAT C ACT G CAT T T T T T T AT ACACT AAT GAC CT GT GAAAT GT T G 1140 



1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 

I | | | | | | I M II ! I I I I II I I I I M I I I I I I I M I I I M I I I I II I I I M 

1141 AGAAAGAAAAGT G G CAT GC AG AT T GCT T T AAAT GAT C AC CT AAAGC AGAGAC GGGAAGT G 



1200 
1200 



1201 GCCTW^ACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260 

I M | | | | | | | | | | I M I I I II I M I I I I I I I M I M I I I I M I II I I I I I I I M I M I I I 

GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 



1201 
1261 
1261 



AGC AGGAT T CT GAAGCT C ACT CT T TAT AAT C AGAAT GAT C C CAAT AGAT GT GAACT T T T G 

I | M I I I I II I I I I I I I I I I I I I I I N I I I I I II I I I I I I I I I I I I I M I I I I II I M I I 

AGC AGGAT T CT GAAGCT C ACT CTT T AT AAT C AGAAT GAT C C CAAT AGAT GT GAACT T T T G 



1260 
1320 
1320 



1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380 

| | | | M | | | M I I M I II I I I I II I I I I I I M I I M I I I I I I I I I M I I 

1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380 



1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 

I I I M I I I I I I I I M I I M I I I I I I I I I I I M I I I M I I I I I I M I I I I I II M I II I I I 

1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 



1440 
1440 



1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500 

| | | | | M I I I I I I I I I I I I I I I I M I I I II I I I M I I I I II I I I I I I M I 

TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 



1441 

1501 AAGTT CAAAGCTAAT GAT CACGGAT AT GACAACTT CCGTT CCAGT AAT AAAT ACAGCT CA 

|| | | | | | | | | | I M I I I M I I I I I I I I I I I I I I I I I I I M I I M M I I I M I I II 

1501 AAGTT CAAAGCTAAT GAT CACGGAT AT GACAACTT CCGTT CCAGT AAT AAAT ACAGCT CA 

1561 T CT T GAAAGAAGAACT ATT C AC T GT AT T T CAT T TT CT TT AT AT T G GAC C GAAGT CAT T AA 

| | | | | | | | | II M I I I II II I I I M I I I I I I M I I I M I I I M I I 

1561 T CT T GAAAGAAGAACT AT T C ACT GT AT T T CAT T T T CT T TAT AT T GGAC C GAAGT CATTAA 

1621 AAC AAAAT G AAAC AT T T G C CAAAAC AAAAC AAAAAAC T AT GT AT T T G C AC AG C AC AC TAT 

I I | | | | I II I I I I I M I M I I II I I I I I I II I I I I I M II M M I I I I I I I I I 

1621 AAC AAAAT GAAAC AT T T G C CAAAAC AAAAC AAAAAACT AT GT AT T T G C AC AGC AC ACT AT 



1500 
1560 
1560 
1620 
1620 
1680 
1680 



1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 17 4 0 

MIM I I I I I I I I I I II I I I I I I I I M I M I I I I I I I I I I I II I I I I 

1681 T AAAAT AT T AAGT GT AAT TAT T T T AAC ACT CACAGCT AC AT AT GAC ATT T TAT GAGCT GT 



1740 



1741 T T AC GGC AT GGAAAGAAAAT C AGT G G GAAT T AAGAAAGC C T C GT C GT GAAAGCACT T AAT 1800 

| | | | | M I I I I I I | | I I I I I I I I II I M I I I I I I I I I I M I I I I I I I I I I I I I I 

1741 T T AC GG CAT G GAAAGAAAAT C AGT G G GAAT T AAGAAAG C CT C GT C GT GAAAGCACT T AAT 1800 



1801 T T T T T ACAGT TAG CACT T CAACAT AGCT CT T AAC AACT T C C AGGAT AT T C ACAC AAC AC T 1860 

| | | | I I I I I M I I M M I M I I I I I I I M I I I I M I I I I I I I I I I I I I I I M I I I I M M 

1801 T T T T T ACAGT T AGCACT T CAACAT AG CT CT T AACAACT T C CAG GAT AT T CAC AC AAC ACT 



1860 



1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920 



HIM M I I I I I M I > I I M I I I I I M II I I I 1 I I I I 1 I 

1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 

1921 AAT C AAT G G GAC T CT G AT AT AAAG GAAGAAT AAGT C ACT GT AAAAC AGAACT T T T AAAT G 

| 1 | | | | | | | | | I I I I M I I I M I I I I II I I 

1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 

1981 AAGCT T AAAT TACT CAAT T T AAAAT T T T AAAAT C CTT T AAAACAACT T T T CAAT T AAT AT 

| | | | | | | | | | M I I I I I I M I I I M I I I I M I I I I I I I I I I I I 1 I 

AAGCT T AAAT TACT CAAT T T AAAAT T T T AAAAT C CTT T AAAACAACT T T T CAAT T AAT AT 



1981 



1920 
1980 
1980 
2040 
2040 



ACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100 



2 041 TATC 

I I I iVlVllVl II I I I M I M I I I I I I I M I I I I I I M I I I M M I I I I M I 

2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 

2101 TT T T C GGAC ACT GGAAAC AT T T AAAT GAT CAGGAGG GAGT AAC AGAAAGAGCAAGGCT GT 

| | | M II I I II M I I I M I I I I I M M I I I I I I M I I I I II I I I I I M I M M I I 

2101 TT T T C G GAC ACT G GAAACAT T T AAAT GAT CAGGAGGGAGT AAC AGAAAGAGCAAGGCT GT 

2161 T T T T GAAAAT CAT T ACACT T T C ACT AGAAGC C C AAAC CT C AGC AT T CT GCAAT AT GT AAC 

I m || | || | | | || I I I I I I I II I I I M I I II I I I I I M I M I II I I I I I I I I I M I I I I I 

T T T T GAAAAT CAT T ACACT T T C ACT AGAAGC CCAAAC CT C AGC ATT C T GCAAT AT GT AAC 



2161 
2221 
2221 
2281 
2281 
2341 



CAAC AT GT CACAAACAAGCAGCAT GT AACAGACT GGCACAT GT GCCAGCT GAATTT AAAA 

I I I | | | I I I M II M M I I I I I M II I I I II I I I I I I I I M M M I M M I M 

CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 
TAT AAT ACT T T TAAAAAG AAAAT TAT T AC AT C CT T T AC AT T C AGT T AAGAT C AAAC CT C A 

I | | | | | M | | | | M I I I II I I I I I I M I I I I II I I I I I I I I I I I I I I I I I M I M 

TAT AAT ACT T T T AAAAAGAAAATT AT T AC AT C C T T T AC AT T C AGT T AAGAT C AAAC CT C A 



CAAAGAGAAAT AGAAT GT T T GAAAGGCT AT C C CAAAAGACT T T T TT GAAT CT GT CAT T C A 

I | M | | | | I I I M M II I I I I I I I M I II I M I I M I I I I M I I II M II I I I I I I I I II 

2341 CAAAGAGAAAT AGAAT GTT T GAAAGG CT AT C C CAAAAGACT TT T T T GAAT CT GT C ATT CA 



2401 CAT AC C CT GT GAAGACAAT ACT AT C T ACAAT T T T TT CAGGAT TAT T AAAAT CTTCTTTTT 

I | | || | | || | M I I II I I II I I I II II I I I I M II I II II MMIIMMMI 

CAT AC C CT GT GAAGACAAT ACT AT CT ACAAT T T T T T CAGGAT TAT T AAAAT CTTCTTTTT 



2401 
2461 



TCAC TAT CGTAGCTTAAACTCTGTTTGGTTTTGT CAT CTGT AAAT ACTTACCTACATACA 

I | | | | | | | | | | M I I I M II I M I I M I I I I I I I I I I I I I I I 11 

2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAAT ACTTACCTACATACA 



2521 CT GC AT GT AGAT GAT TAAAT GAGG GC AGGC C CT GT GCT C AT AGCT T T AC GAT G GAGAGAT 

! | | | M | I I M | M I II I II I M I I I I I I M I I I I I I I I I I I M I I I I I I I I M I I M I I 

2521 CT G CAT GT AGAT GAT TAAAT G AGGGCAG GCCCTGTGCT C AT AG CT T T AC GAT GGAGAGAT 
2581 GCCAGT GACCT CATAATAAAGACT GT GAACT GCCTGGT GCAGT GT CCACAT GACAAAGGG 

I | | | | | | | | || I I I I II I I I I I I M I I M I I II I I I I I I I M M I I I I M I I 

2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACAT GACAAAGGG 



2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 

I I | | | | M I I I I I I I M | I I I I II M I I I I I I I I I M I I I I I I M I I I I I I I I I I I I I M 

2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 



2100 

2160 

2160 

2220 

2220 

2280 

2280 

2340 

2340 

2400 

2400 

2460 

2460 

2520 

2520 

2580 

2580 

2640 

2640 

2700 

2700 



2701 GCTATAGTTTW^TACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760 

|| || | I I I I I || I I II II II I I I I I I M I I I II II I I I M M I I II II I I I I I I I 



27 01 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 27 60 

27 61 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820 

| | | | | | | | | I I I I M M I I II I I I I I I I I I I I I I I I I I I I I I I I M I 1 I I I M I M I I I I 
2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGA7WVGTTTGCTTG 2820 

2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880 

| | | | | | | | | | | | | | I I I I I I I I I I I I I I I I I I I I I II I I M I M I I I I I I I I I II I I I I I 
2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 28 8 0 

2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940 

|| | | | | | I I I I I I I I I II I M II I I II I I I I I I I I I I I I I I I II I I I 

28 81 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 294 0 

2941 GGGAT GAGAT GT GT GT GAAAGT AT GT ACAAGAGAAAAC GGAAGAGAGAGGAAAT GAGGT G 3000 

| | M | | | || | | I I I I I I I I I M II I I I II M I I I M I I I I I I I M I I I I I I I I I I I I I I I 
2941 GGGAT GAGAT GT GT GT GAAAGT AT GT ACAAGAGAAAAC GGAAGAGAGAGGAAAT GAGGT G 3000 

3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060 

| M | || M | | I I I I I I I I I I I I I II I I I I I I I I M I I I I I M I I I II I I II I I I I I I I I I 
3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060 

3061 C GT CAC AT CAAT GCAAAAG GT C CT GAT T T T GT T C CAGCAAAAC AC AGT GCAAT GTT CT C A 3120 

|| | M | | M I M I I I I i I I I I M I I I I M I I M I M M Ill I 

3061 C GT CAC AT CAAT GCAAAAG GT C CT GAT T T T GT T C CAGCAAAAC AC AGT G CAAT GTT CT CA 3120 

3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTT^CTCGGTCTTAAAATATGCCCAA 318 0 

|| | | | | | | | | | I I I I I I I II I I I I I I I I i I I I I I I I I I I I M I I I M I I I I I I I I I I I I I 
3121 GAGT GAC T T T CGAAATAAAT T G G G C C CAAGAGCT T T AACT C G GT CT T AAAAT AT GC C CAA 318 0 

3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGG7\AATAAGCTAGTAATG 324 0 

| | | | | I I I I I I I I I I I I I I I M I I I M I I I I I I M I I I I I I I I I I I I I I I M I I I II II I 
3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240 

3241 TTGTTTTCTGT CAAT AT T GAAT GT GAT G GT ACAGT AAAC CAAAACC CAACAAT GT GGC CA 3300 

| | || | | | | || I I II II I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I M I I I I II I 
3241 TTGTTTTCTGT CAAT AT T GAAT GT GAT G GT ACAGT AAAC CAAAACC CAACAAT GT GGC CA 3300 

3301 GAAAGAAAGAGC AAT AAT AAT T AAT T CAC AC AC CAT AT G GAT T CT AT T T AT AAAT CAC C C 33 60 

| | | | | | I M I I I I I I I I I I M I I I I I I I M I I I M I I II I I I I I I I I I I I 

3301 GAAAGAAAGAGC AAT AAT AAT T AAT T CAC AC AC CAT AT G GAT T CT AT TT AT AAAT CAC C C 33 60 

3361 ACAAACT T GT T CT T T AAT T T CAT C C CAAT C ACT T T T T CAGAG GC CT GT TAT C AT AGAAGT 3420 

Ml I I I I I I I I M I I I I I I I I I I I I I II I I I I I M I I I I I 

3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420 

3421 CAT T T T AGACT CT CAAT T T T AAAT TAAT T T T GAAT C ACT AAT AT T T T CAC AGT T TAT T AA 3480 

|| | | | | | | || I I I I I II I I I I I I I I I I I I M I I I I I M I I I I I I I I I I I I I I I I I I I I I I 
3421 CAT T T T AGACT CT CAAT T T T AAAT TAAT T T T GAAT C ACT AAT ATT T T CAC AGT TT AT T AA 34 80 

34 81 TAT AT T TAAT T T CT AT T T AAAT T T TAG AT TAT T T TT AT T AC CAT GT ACT GAAT T T T T AC A 3540 

| | M I I I I I I I I I I II I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 
3481 TAT AT T TAAT T T CT AT T T AAAT T T TAG AT TAT T T TT AT T AC CAT GT ACT GAAT T T T T AC A 3540 

3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600 

I | | I | | | | I I I I M I I I I I I I I I I I I I M I I I I I I I I I I I M I I I I I I I I I I I I I I I I M 
3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600 



Qy 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660

Qy 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720

Qy 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780

Qy 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840

Qy 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900

Qy 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960

Qy 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020

Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080

Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140

Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200

Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
        ||||||||||||||||||||||||||
Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
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Qy 1 GAGAC AT T C C G GT GG GG GAC T C T G GC CAGC C C GAG CAAC GT G GAT C CT GAGAGCACT C C C 60 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II M I I I I I I I I I I I I I I 
Db 1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCT GAGAGCACT CCC 60 

Qy 61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120

Qy 121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180

Qy 181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
       |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db 181 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240

Qy 241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300

Qy 301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360

Qy 361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420

Qy 421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480

Qy 481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540

Qy 541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600

Qy 601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660

Qy 661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720

Qy 721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780

Qy 781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840



Qy 841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900

Qy 901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960

Qy 961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020

Qy 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080

Qy 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140

Qy 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200

Qy 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260

Qy 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320

Qy 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380

Qy 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440

Qy 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500

Qy 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560

Qy 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620

Qy 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680

Qy 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
Qy 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800

Qy 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860

Qy 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920

Qy 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980

Qy 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040

Qy 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100

Qy 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160

Qy 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220

Qy 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280

Qy 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340

Qy 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400

Qy 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460

Qy 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520

Qy 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580


Qy 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640


Qy 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700


Qy 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760


Qy 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820


Qy 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880


Qy 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940


Qy 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000


Qy 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060


Qy 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120


Qy 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180


Qy 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240


Qy 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300


Qy 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360


Qy 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420



Qy 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480


Qy 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540


Qy 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600


Qy 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660


Qy 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720


Qy 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780


Qy 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840


Qy 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900


Qy 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960


Qy 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020


Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080


Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140


Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200


Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260



Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
        ||||||||||||||||||||||||||
Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
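
Each alignment block above pairs a query row (Qy) with a database row (Db) and marks identical positions with a bar. A minimal Python sketch of how such a marker row can be derived from two already-aligned rows follows. The function name `match_line` and the use of `-` as a gap character are assumptions made for illustration; the search software that produced this report may mark gaps or conservative substitutions differently.

```python
def match_line(qy: str, db: str) -> str:
    """Return a marker row with '|' under identical, non-gap positions."""
    if len(qy) != len(db):
        raise ValueError("aligned rows must have the same length")
    # '-' as the gap symbol is an assumption, not taken from the report.
    return "".join("|" if a == b and a != "-" else " " for a, b in zip(qy, db))


# Final alignment row of the listing above (positions 4261-4286).
qy = "AAAATGCCACATTTCTGGTCTCTGGG"
db = "AAAATGCCACATTTCTGGTCTCTGGG"
print(match_line(qy, db))
```

A position where the two rows differ is left blank in the marker row, which is how a single mismatch shows up as a gap in an otherwise solid run of bars.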



RESULT 9 
ADD18443 

ID ADD18443 standard; DNA; 4286 BP. 
XX 

AC ADD18443; 
XX 

DT 15-JAN-2004 (first entry) 
XX 
DE Human prostate cancer diagnosis related DNA sequence SeqID15.
XX
KW prostate tissue; cancer diagnostic; cancer marker; prostate cancer; PCA;
KW male cancer-related death; serum biomarker; tissue biomarker; cytostatic;
KW gene therapy; prostate biopsy tissue; AMACR;
KW alpha-methylacyl-coenzyme A racemase; diagnosing cancer; cell growth;
KW human; ds.

XX 

OS Homo sapiens. 
XX 

PN WO2003012067-A2. 
XX 

PD 13-FEB-2003. 
XX 

PF 02-AUG-2002; 2002WO-US024567.
XX
PR 02-AUG-2001; 2001US-0309581P.
PR 15-NOV-2001; 2001US-0334468P.
PR 01-AUG-2002; 2002US-00210120.
XX 

PA (UNMI ) UNIV MICHIGAN. 
XX 

PI Rubin MA, Chinnaiyan AM, Sreekumar A; 
XX 

DR WPI; 2003-278396/27. 
XX 

PT Characterizing prostate tissue comprises providing a prostate tissue 

PT sample from a subject and detecting the presence or absence of expression 

PT of hepsin, pim-1 or EZH2.

XX 

PS Disclosure; SEQ ID NO 15; 297pp; English. 
XX 
CC This invention relates to a novel method of characterising prostate
CC tissue in a subject and to compositions and methods for cancer
CC diagnostics, including cancer markers, in particular prostate cancer.
CC Prostate cancer (PCA) is a leading cause of male cancer-related death.
CC Additional serum and tissue biomarkers would aid diagnosis. The invention
CC may provide means of producing compounds with a cytostatic activity or
CC allow the development of gene therapy. The methods of the invention
CC useful for characterising prostate tissue in a subject, screening
CC compounds, characterising inconclusive prostate biopsy tissue in a
CC subject, detecting AMACR (alpha-methylacyl-coenzyme A racemase)
CC expression in a bodily fluid, characterising tissue in a subject,
CC diagnosing cancer in a subject and inhibiting the growth of cells. The
CC present sequence is a DNA sequence which is preferably utilised in the
CC method of the invention.




XX 
SQ Sequence 4286 BP; 1327 A; 829 C; 816 G; 1314 T; 0 U; 0 Other;




Query Match 99.6%; Score 4284.4; DB 9; Length 4286; 
Best Local Similarity 100.0%; Pred. No. 0; 

Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy 1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60


Qy 61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120


Qy 121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180


Qy 181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
       |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db 181 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240


Qy 241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300


Qy 301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360


Qy 


361 


ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 

> i i i 1 i 1 1 1 t 1 1 1 1 1 1 1 I 1 1 ( 1 I 1 1 I 1 1 1 1 1 1 1 1 t 1 1 1 1 1 

| | | | | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 N 1 1 1 1 M 1 II 1 1 1 1 1 M II M II 11 1 M M 1 1 I 1 1 1 I I 
ACC G CAGAGAT AAT GAC GC CAC C CACT AAGAC CT T AT GGC C CAAGGGT T C CAAC GC C AGT 


420 


Db 


361 


420 


Qy 


421 


CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 

i i i i i i i i i i i i i i i i i i i i i i i i i i i i i 1 | 1 1 1 1 1 1 1 1 1 | 1 

| | | | | | | | | | | | I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I' 1 11 11 1 11 
CT GGC GCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGAC GGC AGGAT CTCCG 


480 


Db 


421 


480 


Qy  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540

Qy  541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600

Qy  601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660

Qy  661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720



Qy  721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780

Qy  781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840

Qy  841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900

Qy  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960



Qy  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020

Qy 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080

Qy 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140

Qy 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200

Qy 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260



Qy 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320

Qy 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380

Qy 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440

Qy 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500

Qy 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560



Qy 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620

Qy 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680

Qy 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740

Qy 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800

Qy 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860

Qy 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920

Qy 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980

Qy 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040

Qy 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100

Qy 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160

Qy 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220

Qy 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280

Qy 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340

Qy 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400

Qy 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460
Qy 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520

Qy 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580

Qy 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640

Qy 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700

Qy 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760

Qy 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820

Qy 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880

Qy 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940

Qy 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000

Qy 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060

Qy 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120

Qy 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180

Qy 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240

Qy 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300



Qy 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360

Qy 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420

Qy 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480

Qy 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540


Qy 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600

Qy 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660

Qy 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720

Qy 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780


Qy 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840

Qy 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900

Qy 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960

Qy 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020


Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080

Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140

Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200

Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
        ||||||||||||||||||||||||||
Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286



RESULT 10
AAF21288
ID AAF21288 standard; DNA; 13611 BP.
XX
AC AAF21288;
XX
DT 14-MAR-2001 (first entry)
XX
DE Human low adenosine antisense oligonucleotide related sequence #2855.
XX
KW Low adenosine antisense oligonucleotide; phosphorothioate; allergy;
KW human; airway disorder; bronchoconstriction; lung inflammation;
KW surfactant depletion; respiratory; bronchodilator; antiinflammatory;
KW immunosuppressive; antiasthmatic; analgesic; hypotensive; cytostatic;
KW respiratory obstruction; pulmonary obstruction; impeded respiration;
KW surfactant hypoproduction; pulmonary vasoconstriction; asthma; RDS;
KW respiratory distress syndrome; pain; cystic fibrosis; allergic rhinitis;
KW pulmonary hypertension; emphysema; pulmonary transplantation rejection;
KW chronic obstructive pulmonary disease; pulmonary infection; bronchitis;
KW cancer; ss.
XX 

OS Homo sapiens. 
XX 

PN WO200062736-A2. 
XX 

PD 26-OCT-2000. 
XX 

PF 24-MAR-2000; 2000WO-US008020.
XX

PR 06-APR-1999; 99US-0127958P.
XX 

PA (UYEC-) UNIV EAST CAROLINA. 

PA (NYCE/) NYCE J W. 

XX 

PI Nyce JW; 
XX 

DR WPI; 2000-679539/66. 
XX 

PT Low adenosine (A) content antisense oligonucleotides which do not trigger 
PT adenosine receptors during metabolism, useful e.g. for treating cancers 
PT and respiratory obstructions. 
XX 

PS Disclosure; Page 1277-1280; 1592pp; English. 
XX 



CC The present invention describes low adenosine (A) content antisense
CC oligonucleotides and compositions (I) comprising them. In the antisense
CC oligonucleotides the A is replaced by a 'Universal' or alternative base.
CC (I) can have respiratory, bronchodilator, antiinflammatory, analgesic,
CC immunosuppressive, antiasthmatic, hypotensive and cytostatic activities.
CC The antisense oligonucleotides and (I) can be used to down-regulate the
CC expression and/or activity of target polypeptides associated with
CC lung/respiratory disorders and malignancies, such as stimulating and
CC activating peptide factors and transmitters, transcription factors,
CC immunoglobulins and antibodies, antibody receptors, cytokines and
CC chemokines, endogenously produced specific and non-specific enzymes,
CC binding proteins, adhesion molecules and their receptors, cytokine and
CC chemokine receptors, adenosine receptors, bradykinin receptors, central
CC nervous system (CNS) and peripheral nervous and non-nervous system
CC receptors, CNS and peripheral nervous and non-nervous system peptide
CC transmitters, defensins, growth factors, vasoactive peptides and
CC receptors, binding proteins and malignancy associated proteins. The
CC antisense oligonucleotides may be used in this way to treat disorders
CC including respiratory obstruction (especially pulmonary obstruction
CC and/or bronchoconstriction) and/or lung inflammation, allergy(ies) and/or
CC surfactant hypoproduction which are associated with a disease or
CC condition selected from pulmonary vasoconstriction, inflammation,
CC allergies, asthma, impeded respiration, respiratory distress syndrome
CC (RDS), pain, cystic fibrosis (CF), allergic rhinitis (AR), pulmonary
CC hypertension, emphysema, chronic obstructive pulmonary disease (COPD),
CC pulmonary transplantation rejection, pulmonary infections, bronchitis,
CC and/or cancer. AAF18434 to AAF21543 represent human polynucleotide
CC fragments and antisense oligonucleotides used in the exemplification of
CC the present invention

XX
SQ Sequence 13611 BP; 3676 A; 3007 C; 3056 G; 3868 T; 0 U; 4 Other;

Query Match 99.6%; Score 4284.4; DB 3; Length 13611;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps 0;



Qy    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1873 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 1932

Qy   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1933 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 1992

Qy  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1993 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 2052

Qy  181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
        |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db 2053 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 2112


Qy  241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2113 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 2172

Qy  301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2173 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 2232

Qy  361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2233 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 2292

Qy  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2293 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 2352


Qy  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2353 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 2412

Qy  541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2413 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 2472

Qy  601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2473 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 2532

Qy  661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2533 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 2592


Qy  721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2593 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 2652

Qy  781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2653 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 2712

Qy  841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2713 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 2772

Qy  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2773 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 2832


Qy  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2833 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 2892

Qy 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2893 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 2952

Qy 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2953 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 3012

Qy 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3013 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 3072

Qy 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3073 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 3132

Qy 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3133 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 3192

Qy 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3193 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 3252

Qy 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3253 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 3312

Qy 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3313 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 3372

Qy 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3373 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 3432

Qy 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3433 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 3492

Qy 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3493 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 3552

Qy 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3553 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 3612

Qy 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3613 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 3672

Qy 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3673 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 3732

Qy 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3733 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 3792

Qy 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3793 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 3852

Qy 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3853 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 3912



Qy 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3913 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 3972

Qy 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3973 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 4032

Qy 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4033 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 4092

Qy 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4093 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 4152


Qy 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4153 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 4212

Qy 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4213 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 4272

Qy 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4273 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 4332

Qy 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4333 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 4392


Qy 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4393 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 4452

Qy 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4453 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 4512

Qy 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4513 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 4572

Qy 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4573 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 4632


Qy 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4633 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 4692

Qy 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4693 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 4752



Qy 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4753 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 4812

Qy 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4813 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 4872

Qy 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4873 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 4932

Qy 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4933 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 4992

Qy 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4993 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 5052

Qy 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5053 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 5112

Qy 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5113 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 5172

Qy 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5173 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 5232

Qy 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5233 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 5292

Qy 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5293 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 5352

Qy 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5353 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 5412

Qy 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5413 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 5472

Qy 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5473 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 5532

Qy 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5533 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 5592

Qy 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5593 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 5652

Qy 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5653 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 5712

Qy 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5713 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 5772


Qy 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5773 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 5832

Qy 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5833 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 5892

Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5893 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 5952




Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5953 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 6012

Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 6013 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 6072

Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 6073 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 6132

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
        ||||||||||||||||||||||||||
Db 6133 AAAATGCCACATTTCTGGTCTCTGGG 6158





RESULT 11
ABZ96982
ID ABZ96982 standard; DNA; 13611 BP.
XX
AC ABZ96982;
XX
DT 17-OCT-2003 (first entry)
XX
DE Human nucleic acid sequence.
XX
KW Human; antisense; lung dysfunction; nasal airway dysfunction;
KW antiinflammatory steroid; ubiquinone; antiinflammatory; antiallergic;
KW antiasthmatic; hypotensive; immunosuppressive; cytostatic; gene therapy;
KW antisense gene therapy; respiratory; lung; adenosine sensitivity;
KW adenosine receptor; bronchodilation; bronchoconstriction; lung allergy;
KW lung inflammation; respiratory disease; ds.
XX
OS Homo sapiens.
XX
PN WO200285308-A2.
XX
PD 31-OCT-2002.
XX
PF 23-APR-2002; 2002WO-US013135.
XX
PR 24-APR-2001; 2001US-0286137P.
XX
PA (EPIG-) EPIGENESIS PHARM INC.
XX
PI Nyce JW, Li Y, Sandrasagra A, Katz E, Pabalan J, Aguilar D;
PI Miller S, Tang L, Shahabuddin S;
XX
DR WPI; 2003-229219/22.
XX
PT Pharmaceutical composition for treating ailments associated with impaired
PT respiration, has oligo(s) antisense to specific gene(s) or its
PT corresponding RNAs, and glucocorticoid or non-glucocorticoid steroid or
PT ubiquinone.
XX
PS Disclosure; SEQ ID NO 12224; 872pp; English.
XX
CC The invention relates to a novel pharmaceutical composition, which has a
CC first active agent comprising an oligonucleotide antisense to the
CC initiation codon, coding region, 5' or 3' end genomic flanking regions,
CC 5' and 3' intron-exon junctions, or regions within 2-10 nucleotides of
CC junctions of genes encoding a polypeptide associated with lung and/or
CC nasal airway dysfunction and a second active agent comprising an
CC antiinflammatory steroid and ubiquinone. A composition of the invention
CC has antiinflammatory, antiallergic, antiasthmatic, hypotensive,
CC immunosuppressive, and cytostatic activity. The composition may have a
CC use in antisense gene therapy. The composition is useful for treating or
CC preventing a respiratory, lung or malignant disease or condition, also
CC for enhancing the prophylactic or therapeutic respiratory effect of an
CC antiinflammatory steroid in a subject, for reducing or depleting levels
CC of, or reducing sensitivity to adenosine, reducing levels of adenosine
CC receptor, producing bronchodilation, increasing levels of ubiquinone or
CC lung surfactant in a subject's tissue, or treating bronchoconstriction,
CC lung inflammation, lung allergies, or a respiratory disease or condition.
CC Note: The sequence data for this patent is not represented in the printed
CC specification, but was obtained in electronic format directly from WIPO
CC at ftp.wipo.int/pub/published_pct_sequences

XX
SQ Sequence 13611 BP; 3676 A; 3007 C; 3056 G; 3868 T; 0 U; 4 Other;

Query Match 99.6%; Score 4284.4; DB 7; Length 13611;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps 0;
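The summary figures for this hit can be cross-checked with a little arithmetic. The sketch below assumes that Query Match is the hit score divided by the perfect score (4301, from the search header) and that Best Local Similarity is the number of matches divided by the number of aligned columns. Both definitions are inferred from the reported numbers rather than taken from the search tool's documentation, and the function names are hypothetical.

```python
# Cross-check of the RESULT 11 summary lines (definitions are assumptions).
PERFECT_SCORE = 4301  # "Perfect score" stated in the search header


def query_match_pct(score: float, perfect: float = PERFECT_SCORE) -> float:
    """Hit score expressed as a percentage of the perfect (self-match) score."""
    return 100.0 * score / perfect


def local_similarity_pct(matches: int, mismatches: int, indels: int = 0) -> float:
    """Matching columns as a percentage of all aligned columns."""
    return 100.0 * matches / (matches + mismatches + indels)


print(round(query_match_pct(4284.4), 1))        # 99.6
print(round(local_similarity_pct(4285, 1), 1))  # 100.0 (4285 of 4286 columns)
```

Under these assumptions the single mismatch is consistent with a similarity that only rounds to 100.0%.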



Qy    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1873 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 1932

Qy   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1933 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 1992

Qy  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1993 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 2052

Qy  181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
        |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db 2053 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 2112
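The block covering query positions 181 to 240 is the only one in this hit where the two rows differ, which agrees with the "Mismatches 1" count in the summary. As an illustration, the following sketch compares the two rows of an ungapped block column by column and reports the coordinates of each difference. The helper name `mismatch_positions` is hypothetical and not part of any search tool.

```python
def mismatch_positions(qy: str, db: str, qy_start: int, db_start: int):
    """Yield (query pos, db pos, query base, db base) for each differing column."""
    if len(qy) != len(db):
        raise ValueError("rows must have equal length (ungapped block)")
    for offset, (q, d) in enumerate(zip(qy, db)):
        if q != d:
            yield qy_start + offset, db_start + offset, q, d


# Rows of the block at query positions 181-240 (database positions 2053-2112).
qy = "AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG"
db = "AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG"

print(list(mismatch_positions(qy, db, 181, 2053)))
# [(201, 2073, 'G', 'C')]
```

The same check applied to any other block of this hit should return an empty list, since the summary reports no further mismatches, indels or gaps.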

Qy  241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2113 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 2172

Qy  301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2173 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 2232

Qy  361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2233 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 2292

Qy  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2293 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 2352

Qy  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2353 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 2412

Qy  541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2413 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 2472

Qy  601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2473 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 2532

Qy  661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2533 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 2592

Qy  721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780

Db 2773 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 2832



Qy  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2833 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 2892

Qy 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2893 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 2952

Qy 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2953 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 3012

Qy 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3013 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 3072


Qy 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3073 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 3132

Qy 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3133 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 3192

Qy 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3193 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 3252

Qy 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3253 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 3312


Qy 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3313 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 3372

Qy 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3373 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 3432

Qy 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3433 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 3492

Qy 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3493 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 3552


Qy 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3553 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 3612

Qy 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3613 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 3672



Qy 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3673 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 3732

Qy 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3733 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 3792

Qy 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3793 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 3852

Qy 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3853 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 3912

Qy 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3913 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 3972

Qy 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3973 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 4032

Qy 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4033 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 4092

Qy 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4093 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 4152

Qy 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4153 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 4212

Qy 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4213 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 4272

Qy 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4273 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 4332

Qy 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4333 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 4392

Qy 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4393 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 4452

Qy 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4453 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 4512

Qy 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4513 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 4572

Qy 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4573 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 4632

Qy 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4633 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 4692

Qy 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4693 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 4752

Qy 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4753 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 4812

Qy 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4813 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 4872

Qy 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4873 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 4932

Qy 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4933 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 4992

Qy 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4993 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 5052

Qy 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5053 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 5112

Qy 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5113 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 5172

Qy 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5173 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 5232

Qy 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5233 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 5292

Qy 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5293 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 5352

Qy 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5353 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 5412

Qy 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5413 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 5472

Qy 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5473 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 5532

Qy 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5533 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 5592

Qy 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 5593 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 5652

Qy 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840



Db 6013 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 6072

Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 6073 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 6132

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
        ||||||||||||||||||||||||||
Db 6133 AAAATGCCACATTTCTGGTCTCTGGG 6158



RESULT 12
AAA35166
ID AAA35166 standard; DNA; 13612 BP.
XX
AC AAA35166;
XX
DT 28-JUL-2000 (first entry)
XX
DE Human adenosine receptor related polynucleotide 2nd SEQ ID NO: 40.
XX
KW Human; adenosine receptor; low adenosine antisense oligonucleotide;
KW phosphorothioate; impaired respiration; inflammation; allergy;
KW allergic disease; bronchoconstriction; inhibitor; antiinflammatory;
KW antiallergic; antiasthmatic; cytostatic; analgesic; impaired airway;
KW lung disease; ischaemic condition; pulmonary vasoconstriction; asthma;
KW respiratory distress syndrome; pain; cystic fibrosis; emphysema;
KW pulmonary hypertension; chronic obstructive pulmonary disease; COPD;
KW cancer; leukaemia; lymphoma; carcinoma; metastasis; ss.
XX
OS Homo sapiens.
XX
PN WO200009525-A2.
XX
PD 24-FEB-2000.
XX
PF 03-AUG-1999; 99WO-US017712.
XX
PR 03-AUG-1998; 98US-0095212P.
XX
PA (UYEC-) UNIV EAST CAROLINA.
XX
PI Nyce JW;
XX
DR WPI; 2000-205971/18.
XX
PT New antisense oligonucleotides useful for treating e.g. pulmonary
PT vasoconstriction, inflammation, allergies, asthma, hypertension,
PT bronchitis, emphysema, respiratory distress syndrome, ischemia or
PT cancers.
XX
PS Disclosure; Page 1194-1197; 1343pp; English.
XX
CC The present invention describes a new composition comprising an antisense
CC oligonucleotide (ON) with low adenosine (up to 15%), which targets
CC nucleic acids involved in bronchoconstriction, allergies, and/or
CC inflammation. The ON can have antiinflammatory, antiallergic,
CC antiasthmatic, cytostatic and analgesic activities. The compositions are
CC useful for the treatment of diseases associated with inflammation,
CC impaired airways, including lung disease and diseases whose secondary
CC effects afflict the lungs of a subject. They can be used for treating
CC e.g. ischaemic conditions, pulmonary vasoconstriction, allergies, asthma,
CC impeded respiration, respiratory distress syndrome, pain, cystic
CC fibrosis, pulmonary hypertension, emphysema, chronic obstructive
CC pulmonary disease (COPD), and cancers such as leukaemias, lymphomas,
CC carcinomas, and cancers which may metastasise to the lungs, including
CC breast and prostate cancer. The reduction of the adenosine content of the
CC ONs reduces side effects. The A-containing ONs break down with the
CC release of deoxyadenosine which activates adenosine receptors causing
CC bronchoconstriction and inflammation. AAA32313 to AAA35312 represent the
CC nucleotide sequences given in the sequence listing from the present
CC invention, which correspond to SEQ ID NO:1 to 2815, and then the last 185
CC sequences are also called SEQ ID NO:1 to 185, but the sequences differ
CC from the previously named sequences. SEQ ID NO: 11 to 1680 (AAA32323 to
CC AAA33992) are specifically claimed ONs from the present invention. N.B.
CC Sequences given in the disclosure of the present invention do not match
CC up with their corresponding SEQ ID NO: sequences given in the sequence
CC listing.
XX
SQ Sequence 13612 BP; 3677 A; 3007 C; 3056 G; 3868 T; 0 U; 4 Other;

Query Match 99.6%; Score 4284.4; DB 3; Length 13612; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy      1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1873 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 1932

Qy     61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1933 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 1992

Qy    121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1993 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 2052



Qy    181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
          |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db   2053 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 2112

Qy    241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2113 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 2172

Qy    301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2173 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 2232

Qy    361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2233 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 2292

Qy    421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2293 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 2352

Qy    481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2353 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 2412

Qy    541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2413 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 2472

Qy    601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2473 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 2532

Qy    661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2533 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 2592
721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 7 80 

I M I I I I I M II M II I M I II M II II I M II II M M I I I I 

2593 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 2652 
781 AAAGC CT C C GT GG GAAT C ACT GT GCT GAGT CT AT GT GCT CT GAGT AT T GAC AGAT AT C GA 840 

1 | I 1 | I I I I I I I I I I I I I 1 I 1 I I I I I I I I M I I 1 I I t I I 1 I I I 1 t I M I M I 

2653 AAAGC CT C C GT GG GAAT C AC T GT GCT GAGT CT AT GT G C T CT GAGT AT T GACAGAT AT C GA 2712 
841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900 

I I I I I M I I I I I I I I I I M I I II II I M M I I I I I I M I I M M I I I I 

2713 GCTGTTGCTTCTTG GAGT AGAAT T AAAGGAAT T GG GGT T C CAAAAT G GAC AG CAGT AGAA 2772 

901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960 

I | | I I I II I I M II M I I I II I I II M I I I M I I II I I I I M M M II 

2773 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 2832 

961 AT AAT T AC GAT G GACT AC AAAG GAAGT TAT CT GC GAAT CT GCT T GCT T CAT C C C GT T C AG 1020 

I | | | | I I I I I I M I II I II M II M II M M M I I I I I II I I II I 

2833 AT AAT T AC GAT GGACT ACAAAG GAAGT TAT CT GC GAAT CTGCTTGCTT CAT C C C GT T C AG 2892 

1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 108 0 

| || || | | | | || || I M I I II I I I II I I M I I M II I I II I I I I I I I I I M I I M I I I I I I 
2893 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 2952 

Qy   1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2953 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 3012

1141 AGAAAGAAAAGT GGCAT GCAGATT GCTT TAAAT GAT CACCTAAAGCAGAGACGGGAAGT G 1200 

| | | | | | | | | | | | I I I I I I I I II I I I I I I I II II II I I I I M M M II I I I I I II M I I I I 
3 013 AGAAAGAAAAGT GGCAT GC AGAT T G CT T TAAAT GAT C AC CT AAAGC AGAGAC GGGAAGT G 3072 

1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260 

| | I I I I Ml I M MUM M I I I I II I I I II 

3073 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 3132 
12 61 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320 

| M M M II II M II I II II I I I I I I I M II I I I I I I I I I I I I I I I I I I I I I M I M I II 

3133 AGCAGGAT T CT GAAGC T CACT C T T TAT AAT C AGAAT GAT C C CAAT AGAT GT GAAC T T T T G 3192 

1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380 

| | | | | M II I I I I M I I M I I I I I M M I I I I I I M I I I I I I I I I II I I I I I M I II M I 
3193 AGCT T T CT GT T GGT ATT GGACT AT ATT GGT AT CAACATGGCTT CACT GAAT TCCTGCATT 3252 

1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 144 0 

| | | | | M II I I I M I I I I M M I I II II I I M II M I I I I I M I M I I I I I II I I I I M I 
3253 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 3312 

Qy   1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3313 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 3372



Qy 1501 AAGT T CAAAGCTAAT GAT CACGGATAT GACAACTT CCGTT CC AGTAATAAATACAGCT CA 1560 

I I I I I I I I I I I I I I I I I II I I I I II I I I I 1 I M I M 

Db 3373 AAGTT CAAAGCTAAT GAT CACGGATAT GACAACTT C CGTT CCAGTAATAAATACAGCT CA 3432 

Qy 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 162 0 

| | | | | I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I M M I 1 I 

Db 3433 T CT T GAAAGAAGAACT AT T CACT GT AT T T CAT T T T CT T TAT AT T GGAC C GAAGT CAT T AA 3492 

Qy 1621 AACAAAAT GAAACATTT GC CAAAACAAAACAAAAAACTAT GT ATT T GCACAGCACACT AT 1680 

| | | | M | I I I I I I I I I I I I I I I M I I I M I I I I I I I I I I I M I I I I I I I I I II I I I I I I I 
Db 34 93 AACAAAAT GAAACATTT GCCAAAACAAAACAAAAAACT AT GTATTT GCACAGCACACT AT 3552 

Qy 1681 T AAAAT AT T AAGT GT AAT TAT T T T AACACT CACAG CT AC AT AT GACAT T T TAT GAG CT GT 1740 

| | | | | | | | | | I I M I I I I I I II I I I I I I I I I I I I I I I I I I I I I M I I I I I M M I I I I I I 
Db 3553 T AAAAT ATT AAGT GTAATTATTTT AACACT CACAGCTACAT AT GACATTTTAT GAGCT GT 3612 

Qy 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800 

| | | | M | I I I I I I I I I I I I I I I I I I I I M I I I I I I I I II I I I II M I I I 

Db 3613 TTAC GGCAT GGAAAGAAAAT CAGT GGGAATTAAGAAAGCCTC GT CGT GAAAGCACTT AAT 3672 

Qy 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860 

| | | | | | | | | | I I I I I II I I I II I I I I I II I I II I I I I I I I I M I I I I I I I I I I I I I I I I I 
Db 3673 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 3732 

Qy   1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3733 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 3792

Qy 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980 

| | | | | | | | | I I I M I I I II II II I I I I I I I I I I I I I I I M I II I I I I I I I I I I I M I I I I 
Db 3793 AAT CAAT GG GAC T CT GAT AT AAAGGAAGAAT AAGT CACT GT AAAAC AGAACT TT TAAAT G 3852 

Qy 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 204 0 

MINIM II I M II II II II M I II II II I I II I I I II I 

Db 3853 AAGCT TAAAT TACT CAAT T TAAAAT T T T AAAAT C C T T T AAAACAACT T T T CAATT AAT AT 3912 

Qy 2041 TAT C AC AC TAT TAT C AGAT T GT AAT TAG AT GC AAAT GAGAGAGC AGT T T AGT T GT T GC AT 2100 

| || || I I I I I I I I I I I I M I I I II I I I M I I I II I II I M M I I I I I I I I I I I I I I I M I 
Db 3913 TAT C AC AC TAT TAT C AGAT T GT AAT T AGAT GC AAAT GAGAGAG CAGT T T AGT T GT T G CAT 3972 

Qy 2101 T T T T C GGACACT GGAAAC AT T TAAAT GAT C AGGAGGGAGTAACAGAAAGAGCAAG GCT GT 2160 

I | | M I II II II I II I II II II II M M I M M I M I I I M I I I II M II I I I I I I I I M 
Db 3973 TTTT CGGACACT GGAAACATT TAAAT GAT CAGGAGGGAGTAACAGAAAGAGCAAGGCT GT 4 032 

Qy 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220 

| || II I II II II I I II II I I I I II I I I I M I I M I I M I II I I I I II I I I I I M M I I I I 
Db 4033 TTTT GAAAAT CAT T AC ACT T T C ACT AGAAGC C CAAAC CT C AGCAT T CT G CAAT AT GT AAC 4092 

Qy   2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4093 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 4152

Qy 22 81 TAT AAT AC T T T T AAAAAGAAAAT T ATT AC AT C CT T T AC AT T CAGT T AAGAT C AAAC CT C A 2 340 

M II I I M I I M I II I I I I I M II M II II I I I II I I II I 

Db 4153 TAT AAT AC T T T T AAAAAGAAAAT TAT T AC AT C CT T T AC AT T CAGT T AAGAT C AAAC CT C A 4212 



2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400 

| | | | | | | | | | | | | | I I M | | | | I I I | II I I I I I I I I II 1 I I I I I I I I I I I I I I 

4213 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 4272 

2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460 

IN | | | I I I i I M I I M 1 I I I I 1 I I I I I I I I 

4273 CAT AC C CT GT GAAGACAAT ACT AT CT AC AAT T T T T T C AG GATT AT T AAAAT CTTCTTTTT 4332 

Qy   2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4333 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 4392

2521 CT GC AT GT AGAT GAT T AAAT GAGGGC AGG C C CT GT GCT CAT AGCT T T AC GAT GG AGAGAT 2580 

| | | | | | | | | | | I I II M I I I I I I I I I I M I I t I I II li I I I I I I I I I I M I II M I I I II 
4393 CT GCAT GT AGAT GATT AAAT GAGGGCAGGCC CT GT GCT CATAGCTTTACGAT GGAGAGAT 4452 

2581 GC CAGT GAC CT C AT AATAAAGACT GT GAACT GC CT G GT GC AGT GT CC ACAT GAC AAAGGG 2640 

|| | | | | | | || | M I I I I I I I I I I I I I I I I I I I I I I I M I I II I I II I I M 

4453 GCCAGT GACCT CATAATAAAGACT GT GAACT GC CT GGT GCAGT GT CCACAT GACAAAGGG 4512 

2 641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700 

| M I I I I I I I I I M I I I I I M I I II I I I I I I I I I M I I I I I M I I I I I II I I I 

4513 GCAGGTAGCACCCTCTCTCACCCAT GCT GTGGTT AAAAT GGTTTCTAGCATATGT AT AAT 4572 

2701 GCT AT AGT T AAAAT ACT AT T TT T CAAAAT C ATACAGAT T AGT AC AT TT AAC AGCT AC CT G 2760 

| | | | | | | | | I I || I I I II I I I I I I I M II I I I I I I M I I I I I I I I M I I I I I I I I I I I I I 
4573 G CT AT AGTTAAAAT ACT AT T T T T CAAAAT CAT AC AGAT T AGT AC AT T T AAC AGCT AC CT G 4632 

2761 TAAAGCT T AT T ACTAAT T T T T GT ATTAT T T T T GTAAAT AGC CAAT AGAAAAGT TT GCTT G 2820 

| M I I II I I I I I I I I M I II I I I I I I I M I I I M II I M I I I II I I I I I I I I I I I I I I I I 
4633 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 4692 

2821 ACAT GGT GCTTTTCTTT CAT CTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCT CTT 2880 

M | | | | I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I 
4 693 ACAT GGT GCTTTTCTTT CAT CTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 4752 

2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 294 0 

| | | M I I I I I I II I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I M I I 
4753 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 4812 

2941 G GGAT GAGAT GT GT GT GAAAGT AT GT ACAAGAGAAAAC G GAAGAGAG AG GAAAT GAGGT G 3000 

| 1 | | | | | I I M I I I I I I I I I I I I I M M I I M I I I I I I I I I I I I I I I I I I I I I I I I M I I 
4813 G GGAT GAGAT GT GT GT GAAAGT AT GT ACAAGAGAAAAC G GAAGAGAG AGGAAAT GAGGT G 4872 

3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060 

| M | | | | | | I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I II M I M I I 

4873 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 4932 

3061 C GT CACAT CAAT GCAAAAGGT C CT GAT T TT GT T C C AGCAAAACAC AGT GCAAT GT T CT CA 3120 

| | | | | M I I I I I I I I I I I M I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I 

4933 C GT CACAT CAAT GCAAAAGGT C CT GAT T TT GT T C C AGCAAAACAC AGT GCAAT GT T CT CA 4992 

3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTC GGT CTT AAAAT AT GCCCAA 3180 

M I I I I I I I I I I I I M I I I I I I I I I I II I I I I M I I I I I I I I I I 

4 993 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTC GGT CTT AAAAT AT GCCCAA 5052 

Qy   3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   5053 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 5112

Qy   3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   5113 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 5172

Qy   3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   5173 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 5232

Qy   3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   5233 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 5292

Qy   3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   5293 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 5352

Qy   3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   5353 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 5412

Qy   3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   5413 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 5472



Qy   3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   5473 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 5532
3661 T T T AAAAAAAAT GT T T GAT T C AAAACT T T AACAT AC T GATAAGTAAGAAACAAT TAT AAT 3720 

M I II I II I M M I I I I I I I I M I I I M II M I I I I I M I I I M I M M I II M I I I I I I 

5533 T T T AAAAAAAAT GT T T GAT T CAAAACT T T AACAT ACT GATAAGTAAGAAACAAT TAT AAT 5592 
3721 T T C T T T AC AT ACT CAAAAC CAAGAT AGAAAAAG GT GCT AT C GT T CAACT T CAAAACAT GT 3780 

M I II I M M I I II I M I M I I I I I I I II I M I I M I I M I II M I II M 

5593 T T CT T T AC AT ACT CAAAACCAAG AT AGAAAAAG GT G CT AT C GT T CAACT T CAAAACAT GT 5652 
37 81 T T C CT AGT AT T AAGGACT T T AAT AT AGCAAC AGACAAAATT AT T GT T AACAT GGAT GT T A 384 0 

M M M II II I I I II II II II I M M I I I I I II M I I I II I I II I II I I I I M I I 

5653 T T C CT AGT AT T AAGGAC T T T AAT AT AG C AAC AGACAAAATT AT T GT T AACAT GGAT GT T A 5712 

3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900 

Ml II II II II II M M I M II II I M M M I I I I I I I M I II II I I I I I 

5713 CAG C T C AAAAGAT T T AT AAAAGAT T T T AAC CT AT TT T CT C C CT T ATT AT C C ACT GCT AAT 5772 

Qy   3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   5773 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 5832

Qy   3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   5833 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 5892

Qy   4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   5893 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 5952



Qy 4081 TT T AT TAT GT AAGCAAAAC CAAT AAAAAT T T AAGT T T T T T T AACAACT AC C T TAT T T T T C 4140 

|| | | | M I I I I I I I I 1 I I I I I I II I I II I I I I I I I I I I I ! I I I I I I M I 

Db 5953 T T TAT T AT GT AAG C AAAAC CAAT AAAAAT T T AAGT T T T T T T AAC AAC T AC C T TAT T T T T C 6012 

Qy 4141 AC T GT AC AGACACT AAT T CAT TAAAT ACT AAT T GAT T GT T T AAAAGAAAT AT AAAT GT GA 4200 

I I | I I || I I I I I I I I I I II I I I I I I I I I M I I I I I II I I I I I I I I I I I I I I I I I I II I I I 
Db 6013 ACT GT ACAGAC AC T AAT T CAT TAAAT ACT AAT T GAT T GT T T AAAAGAAAT AT AAAT GT GA 6072 

Qy 4201 CAAGT GGAC AT TAT T TAT GT TAAAT AT ACAAT TAT CAAGCAAGTAT GAAGT T ATT CAAT T 42 60 

I I | I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I M II I I I I I I I I I I I I I I I I I II I I I 
Db 6073 CAAGT G GAC AT TAT T TAT GT TAAAT AT AC AAT TAT C AAG CAAGT AT GAAGT TAT T CAAT T 6132 

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286 

I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 6133 AAAATGCCACATTTCTGGTCTCTGGG 6158 
PT New combination comprising cDNAs that are differentially expressed in
PT disorders e.g., lung cancer, chronic obstructive pulmonary disease,
PT emphysema or asthma.
XX
PS Claim 1; Page; 39pp; English.
XX
CC The invention relates to a combination comprising cDNAs or their
CC complements that are differentially expressed in respiratory disorder.
CC The combination is useful for preparing a composition for diagnosing or
CC treating respiratory disorders e.g. lung cancer, chronic obstructive
CC pulmonary disease, emphysema or asthma. The present sequence represents
CC human cDNA differentially expressed during lung cancer
XX 

SQ Sequence 4305 BP; 1341 A; 835 C; 816 G; 1311 T; 0 U; 2 Other; 

Query Match 97.7%; Score 4202.4; DB 8; Length 4305; 

Best Local Similarity 99.4%; Pred. No. 0; 

Matches 4280; Conservative 0; Mismatches 18; Indels 8; Gaps 6; 
Qy 1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60 

| | | M | I I I I I I I I I II I I I I I I I I I I I I I M I M I I I M I I I I I I I I I I I I I M II I I I 

Db 1 GAGAC AT T C C GGT G G GGGAC T CT GG C C AGC C C GAGCAAC GT GGAT C CT GAGAG CACT C C C 60 

Qy 61 AGGTAGGCATTTGCCCCGGT GGGAC GCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120 

| | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M 

Db 61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120 

Qy 121 AG GAT CAAC ACAGT GGCT GAAC ACT GG GAAGGAACT G GT ACT T GGAGT CT G GACAT C T G A 18 0 

| | | | | | | | M I I II I I I I I I I I I I M I I I M I I I I I I I I I I I I I I I I I I I I I M I II I I I 
Db 121 AG GAT CAAC ACAGT GGCT GAAC ACT GG GAAGGAACT GGTACTT GGAGT CT GGACAT CTGA 180 

Qy 181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240 

I I I I I I I I I I M I I I II I I I I M I I I I I I I II Ill Ml 

Db 181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 24 0 

Qy 241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300 

| | I I I I I I I I M I I M M I I I I M I I I I I I I I I I I I I M I I I I II I I I I I I I M I I I I I I 

Db 241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300 

Q y 301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360 

| | | | | | M I I I || I I I I II I I I I I I I I I I I I I M M I I I I I I I I I I II II 

Db 301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360 

Qy 361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420 

M I II I I II I I I I I II I I M I I I I I I I I I I I M I I I I I I I M 

Db 361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420 

Qy 421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480 

| 1 | | | II I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I M 

Db 421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480 

Q y 481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540 

| M || | | | I I I I I M I II I I I 1 I I I I I M I I I I I I I I I I M I M 1 I I I I I I I I I 

Db 481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 54 0 

Qy    541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600

Qy 601 CT T CT GAGAAT TAT CT ACAAGAAC AAGT GCAT GC GAAAC GGT C C CAAT AT CT T GAT C GC C 660 

| | | I I I I II MINIM I I I M II I I I I I I II I I I I I I I I I I I N 

Db 601 CT T CT GAGAAT TAT CT ACAAGAAC AAGT GCAT GC GAAAC G GT C C CAAT AT CT T GAT C GC C 660 



Qy    661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720

721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 7 80 

| | | | | | | | | | | | | M M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

721 CT G CT G GC AGAG GAC T G G C CAT T T G GAG CT GAGAT GT GT AAGC T G GT GC CT T T CAT AC AG 78 0 

781 AAAGC CT C C GT GG GAAT C ACT GT GCT GAGT C TAT GT G C T CT GAGT AT T GACAGAT AT C GA 84 0 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I M I II I I I I I 

781 AAAGCCT CC GT GGGAAT CACT GT GCT GAGT CTAT GT GCT CT GAGT ATT GACAGAT AT CGA 84 0 

841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900 

M I I 1 I I I II M I I I I I I I M M I I I II I I I I I I I I I I I I I 1 I I I I M I I I I I I I I I I I I 
841 GCTGTTGCTTCTTG GAGT AGAAT T AAAGGAAT TGGGGTTC CAAAAT GGACAGCAGT AGAA 900 

901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960 

| | I I I I I I I I | | | | I I I I I I II I I I I M I I I I I I I I I I I I I I I I I I I I I I I M I 

901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960 

961 AT AAT T AC GAT GGACT ACAAAGGAAGT TAT CT GC GAAT CTGCTTGCTT CAT C C C GT T CAG 1020 

| | | | | | | | | | I I I I I M II I I I I I I I I I I I I I I I I I I I M I I I I I I I M I I I I I M I I I I 
961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020 

1021 AAGACAGCTT T CAT GC AGT T T T ACAAGACAGCAAAAGAT T GGT GGCT GT T C AGTT T C TAT 108 0 

|| || || | | | | | I I I I I I II I I I I I I I I I I I I I M I I I I I I I I I M I M I I I I I 

1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080 

1081 TTCTGCTTGCCATTGGCCAT CACT GCATTTTTTTATACACTAAT GAC CTGTGAAATGTTG 1140 

| | | | | | || I I I I I I I I I I I I I II I I I I I II I I I I I I M II I I I I I I I I I I I I I M I I I I I 
1081 TTCTGCTTGC C ATT GG C CAT CACT GC AT T T T T T T AT ACACT AAT GAC CT GT GAAAT GT T G 1140 

1141 AGAAAGAAAAGT GGCAT GC AGAT T GCT T T AAAT GAT C AC CTAAAGCAGAGAC GGGAAGT G 1200 

| | | | | | | I I I M I I I I I I I II I I II I II I I I I I I I I I I I I I I I I I I I I I I I M I I I I I II 
1141 AGAAAGAAAAGT GGCAT GC AGAT T GCT T TAAAT GAT C AC CTAAAGCAGAGAC GGGAAGT G 1200 

1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260 

|| | | | | M I II I I I I I I I I I I I I I I I I M I I I I I I I I M I I I I I I I I I I I I I 

1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260 

1261 AGC AGGAT T CT GAAG CT C ACT CT T TAT AAT CAGAAT GAT C C CAAT AGAT GT GAACT T T T G 1320 

| | | | M | I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I 
1261 AGCAGGAT T C T GAAGCT CACT CT T TAT AAT CAGAAT GAT C C CAAT AGAT GT GAACTT T T G 1320 

1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380 

13 81 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440 

| || | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I II M I I 
1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440 

1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500 

| | I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I 

1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500 

1501 AAGT T CAAAG CT AAT GAT CAC GGAT AT GACAAC T T C C GT T CC AGTAAT AAAT AC AGCT CA 1560 

I | | | | | | || || I I I I I I I II I I II I I I I I I I I I I I I I M I II I I I I I I M I I I I I I I I II 
1501 AAGTT CAAAGCTAAT GAT CACGGATAT GACAACTT CCGTT CCAGTAATAAAT ACAGCT CA 1560 



Qy   1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620

Qy   1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680

Qy   1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740

Qy   1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800

Qy   1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860

Qy   1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920

Qy   1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980

Qy   1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040

Qy   2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100

Qy   2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160

Qy   2161 TTTTGAAAATCATTACACTTTCAC--TAGAAGCCCAAACCTCAGCATT-CTGCAATATGT 2217
          ||||||||||||||||||||||||     |  |||||||||||||||| |||||||||||
Db   2161 TTTTGAAAATCATTACACTTTCACCTAGAAGCCCCAAACCTCAGCATTCCTGCAATATGT 2220

Qy   2218 AA-CCAACATGTCACAAACAAGCAG--CATGTAACAGACTGGCACATGTG-CCAGCTGAA 2273
          || ||||||||||||||||||||    ||||||||||||||||||||||| |||||||||
Db   2221 AACCCAACATGTCACAAACAAGCCAGCCATGTAACAGACTGGCACATGTGCCCAGCTGAA 2280

Qy   2274 TTTAAAATATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCA 2333
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2281 TTTAAAATATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCA 2340

Qy   2334 AACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTG 2393
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2341 AACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTG 2400



Qy  2394 TCATTCACATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCT 2453
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2401 TCATTCACATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCT 2460

Qy  2454 TCTTTTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCT 2513
         |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2461 TCTTCTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCT 2520

Qy  2514 ACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATG 2573
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2521 ACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATG 2580

Qy  2574 GAGAGATGCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGA 2633
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2581 GAGAGATGCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGA 2640

Qy  2634 CAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATAT 2693
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2641 CAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATAT 2700

Qy  2694 GTATAATGCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAG 2753
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2701 GTATAATGCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAG 2760

Qy  2754 CTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGT 2813
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2761 CTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGT 2820

Qy  2814 TTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGA 2873
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2821 TTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGA 2880

Qy  2874 ACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGG 2933
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2881 ACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGG 2940

Qy  2934 ATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAA 2993
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2941 ATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAA 3000

Qy  2994 TGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTC 3053
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3001 TGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTC 3060

Qy  3054 ATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAAT 3113
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3061 ATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAAT 3120

Qy  3114 GTTCTCAGAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATA 3173
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3121 GTTCTCAGAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATA 3180

Qy  3174 TGCCCAAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCT 3233
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3181 TGCCCAAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCT 3240

Qy  3234 AGTAATGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAAT 3293
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3241 AGTAATGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAAT 3300

Qy  3294 GTGGCCAGAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAA 3353
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3301 GTGGCCAGAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAA 3360

Qy  3354 ATCACCCACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCA 3413
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3361 ATCACCCACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCA 3420

Qy  3414 TAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGT 3473
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3421 TAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGT 3480

Qy  3474 TTATTAATATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAAT 3533
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3481 TTATTAATATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAAT 3540

Qy  3534 TTTTACATCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGC 3593
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3541 TTTTACATCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGC 3600

Qy  3594 CAAATTTTGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCA 3653
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3601 CAAATTTTGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCA 3660

Qy  3654 GTGGCTTTTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAA 3713
         |||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||
Db  3661 GTGGCTTTTT-AAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAA 3719

Qy  3714 TTATAATTTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAA 3773
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3720 TTATAATTTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAA 3779

Qy  3774 AACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATG 3833
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3780 AACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATG 3839

Qy  3834 GATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCAC 3893
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3840 GATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCAC 3899

Qy  3894 TGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAG 3953
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3900 TGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAG 3959

Qy  3954 GAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATA 4013
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  3960 GAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATA 4019

Qy  4014 ACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTA 4073
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  4020 ACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTA 4079

Qy  4074 CTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTT 4133
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  4080 CTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTT 4139



Qy  4134 ATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATA 4193
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  4140 ATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATA 4199

Qy  4194 AATGTGACAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTA 4253
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  4200 AATGTGACAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTA 4259

Qy  4254 TTCAATTAAAATGCCACATTTCTGGTCTCTGGGAAAAAAAAAAAAA 4299
         |||||||||||||||||||||||||||      |||||||   | |
Db  4260 TTCAATTAAAATGCCACATTTCTGGTCAAAAAAAAAAAAAGNNAGA 4305



RESULT 14 
ABK94410 

ID ABK94410 standard; DNA; 2972 BP. 
XX 

AC ABK94410; 
XX 

DT 27-AUG-2002 (first entry) 
XX 

DE DNA encoding endothelin receptor B (EDNRB), exon 7.
XX 

KW Endothelin; EDN; endothelin converting enzyme; ECE; EDNRB; 

KW endothelin receptor B; signaling system; cardiovascular disease; 

KW coronary heart disease; hypertension; atherosclerosis; angiogenesis;

KW fatty acid metabolism; diabetes; familial hypercholesterolaemia; 

KW forensic marker; transgenic animal; solid support; SNP; 

KW cardiovascular regulator; gene; ds; single nucleotide polymorphism.

XX 

OS Homo sapiens. 



XX 

FH Key Location/Qualifiers 

FT variation replace (1048, A)

FT /*tag= a 

FT /standard_name= "Single nucleotide polymorphism" 

FT variation replace (1658, C)

FT /*tag= b 

FT /standard_name= "Single nucleotide polymorphism" 

FT variation replace (1912, T)

FT /*tag= c 

FT /standard_name= "Single nucleotide polymorphism" 

FT variation replace (2130, T) 

FT /*tag= d 

FT /standard_name= "Single nucleotide polymorphism" 

XX 



PN WO200224747-A2. 
XX 

PD 28-MAR-2002. 
XX 

PF 31-AUG-2001; 2001WO-EP010087.
XX 

PR 19-SEP-2000; 2000EP-00120123.
XX 

PA (EPID-) EPIDAUROS BIOTECHNOLOGIE AG. 



XX 

PI Brinkmann U, Hoffmeyer S; 
XX 

DR WPI; 2002-435060/46. 
XX 

PT Novel polynucleotide of the endothelin/endothelin converting 

PT enzyme/receptors of endothelin and endothelin converting enzyme signaling 

PT system associated with cardiovascular disease, useful for treating the 

PT disease. 

XX 

PS Claim 1; Page; 190pp; English. 
XX 

CC The invention describes a polynucleotide (I) of the endothelin 

CC (EDN)/endothelin converting enzyme (ECE)/receptors of EDN and ECE (EDNR)

CC signaling system which is associated with a cardiovascular disease. (I), 

CC the gene encoding EDN, ECE or EDNR (II) or a vector (III) expressing (I) 

CC or (II) is useful for producing cells capable of expressing a molecular 

CC variant polypeptide which is associated with a cardiovascular disease. 

CC (II), (III), the EDN, ECE or EDNR polypeptide, or a cell expressing a 

CC molecular variant gene comprising (I) is useful for identifying and 

CC obtaining a pro-drug or drug capable of modulating the activity of a 

CC molecular variant of a polypeptide of the EDN/EDNR/ECE signaling system 

CC or its gene product, or for identifying and obtaining an inhibitor of the 

CC activity of a molecular variant of a polypeptide of the EDN/EDNR/ECE 

CC signaling system or its gene product. The isolated proteins and 

CC polynucleotides encoding them are useful for preparation of a 

CC pharmaceutical composition for treating a cardiovascular disease such as 

CC coronary heart disease, hypertension, atherosclerosis, or related to 

CC abnormal angiogenesis or fatty acid metabolism e.g. diabetes and familial 

CC hypercholesterolemia. The gene or a polynucleotide fragment of the

CC EDN/ECE/EDNR signaling system are useful as forensic markers, for

CC creating a transgenic animal and in creation of a solid support 

CC comprising polynucleotides, genes, vectors, polypeptides, antibodies or 

CC host cells of the invention. This sequence encodes a fragment of the 

CC cardiovascular regulator Endothelin receptor B (EDNRB). Note: This

CC sequence does not appear in the specification but has been obtained from 

CC GenBank using information given in the invention 

XX 

SQ Sequence 2972 BP; 1018 A; 499 C; 465 G; 990 T; 0 U; 0 Other; 
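
The base counts on an SQ line can be checked against the stated sequence length. The following is a minimal Python sketch, assuming the line follows the `SQ Sequence <n> BP; <count> <base>; ...` layout seen in this report; the function name `parse_sq_line` is illustrative and not part of any search tool.

```python
import re

def parse_sq_line(line: str) -> tuple[int, dict[str, int]]:
    """Return (stated_length, {base: count}) parsed from an SQ summary line."""
    length_match = re.search(r"Sequence\s+(\d+)\s+BP", line)
    if length_match is None:
        raise ValueError("not an SQ summary line")
    # Collect every "<count> <label>;" pair except the length field itself.
    counts = {
        label: int(number)
        for number, label in re.findall(r"(\d+)\s+([A-Za-z]+);", line)
        if label != "BP"
    }
    return int(length_match.group(1)), counts

sq = "SQ Sequence 2972 BP; 1018 A; 499 C; 465 G; 990 T; 0 U; 0 Other;"
stated_length, base_counts = parse_sq_line(sq)
# The per-base counts should add up to the stated length.
print(stated_length, sum(base_counts.values()))
```

For the record above both numbers come out equal, which is a quick way to catch digits dropped or split during text extraction.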

Query Match 66.4%; Score 2857; DB 6; Length 2972; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 2857; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy  1430 AGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGC 1489
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     9 AGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGC 68

Qy  1490 AGTCGTGCTTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATA 1549
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    69 AGTCGTGCTTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATA 128

Qy  1550 AATACAGCTCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACC 1609
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   129 AATACAGCTCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACC 188

Qy  1610 GAAGTCATTAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCA 1669
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   189 GAAGTCATTAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCA 248

Qy  1670 CAGCACACTATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATT 1729
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   249 CAGCACACTATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATT 308

Qy  1730 TTATGAGCTGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGA 1789
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   309 TTATGAGCTGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGA 368

Qy  1790 AAGCACTTAATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATT 1849
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   369 AAGCACTTAATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATT 428

Qy  1850 CACACAACACTTAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAG 1909
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   429 CACACAACACTTAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAG 488

Qy  1910 ATTTATTTTTAAATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGA 1969
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   489 ATTTATTTTTAAATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGA 548

Qy  1970 ACTTTTAAATGAAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTT 2029
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   549 ACTTTTAAATGAAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTT 608

Qy  2030 TCAATTAATATTATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTT 2089
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   609 TCAATTAATATTATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTT 668

Qy  2090 AGTTGTTGCATTTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAG 2149
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   669 AGTTGTTGCATTTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAG 728

Qy  2150 AGCAAGGCTGTTTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTG 2209
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   729 AGCAAGGCTGTTTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTG 788

Qy  2210 CAATATGTAACCAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGC 2269
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   789 CAATATGTAACCAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGC 848

Qy  2270 TGAATTTAAAATATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAG 2329
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   849 TGAATTTAAAATATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAG 908

Qy  2330 ATCAAACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAA 2389
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   909 ATCAAACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAA 968

Qy  2390 TCTGTCATTCACATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAA 2449
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   969 TCTGTCATTCACATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAA 1028



Qy  2450 ATCTTCTTTTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTT 2509
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1029 ATCTTCTTTTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTT 1088

Qy  2510 ACCTACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTAC 2569
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1089 ACCTACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTAC 1148

Qy  2570 GATGGAGAGATGCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCAC 2629
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1149 GATGGAGAGATGCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCAC 1208

Qy  2630 ATGACAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGC 2689
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1209 ATGACAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGC 1268

Qy  2690 ATATGTATAATGCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTA 2749
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1269 ATATGTATAATGCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTA 1328

Qy  2750 ACAGCTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAA 2809
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1329 ACAGCTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAA 1388

Qy  2810 AAGTTTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGT 2869
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1389 AAGTTTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGT 1448

Qy  2870 AAGAACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCT 2929
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1449 AAGAACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCT 1508

Qy  2930 TAGGATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAG 2989
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1509 TAGGATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAG 1568

Qy  2990 GAAATGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTT 3049
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1569 GAAATGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTT 1628

Qy  3050 CGTCATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTG 3109
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1629 CGTCATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTG 1688

Qy  3110 CAATGTTCTCAGAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAA 3169
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1689 CAATGTTCTCAGAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAA 1748

Qy  3170 AATATGCCCAAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATA 3229
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1749 AATATGCCCAAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATA 1808

Qy  3230 AGCTAGTAATGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAA 3289
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1809 AGCTAGTAATGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAA 1868

Qy  3290 CAATGTGGCCAGAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTT 3349
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1869 CAATGTGGCCAGAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTT 1928



Qy  3350 ATAAATCACCCACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTT 3409
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1929 ATAAATCACCCACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTT 1988

Qy  3410 ATCATAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCA 3469
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1989 ATCATAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCA 2048

Qy  3470 CAGTTTATTAATATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACT 3529
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2049 CAGTTTATTAATATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACT 2108

Qy  3530 GAATTTTTACATCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATC 3589
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2109 GAATTTTTACATCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATC 2168

Qy  3590 TTGCCAAATTTTGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCA 3649
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2169 TTGCCAAATTTTGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCA 2228

Qy  3650 TTCAGTGGCTTTTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAA 3709
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2229 TTCAGTGGCTTTTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAA 2288

Qy  3710 ACAATTATAATTTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACT 3769
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2289 ACAATTATAATTTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACT 2348

Qy  3770 TCAAAACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAA 3829
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2349 TCAAAACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAA 2408

Qy  3830 CATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTAT 3889
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2409 CATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTAT 2468

Qy  3890 CCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCC 3949
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2469 CCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCC 2528

Qy  3950 AAAGGAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAA 4009
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2529 AAAGGAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAA 2588

Qy  4010 TATAACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATA 4069
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2589 TATAACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATA 2648

Qy  4070 GTTACTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTA 4129
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2649 GTTACTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTA 2708

Qy  4130 CCTTATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAA 4189
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2709 CCTTATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAA 2768



Qy  4190 TATAAATGTGACAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAA 4249
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2769 TATAAATGTGACAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAA 2828

Qy  4250 GTTATTCAATTAAAATGCCACATTTCTGGTCTCTGGG 4286
         |||||||||||||||||||||||||||||||||||||
Db  2829 GTTATTCAATTAAAATGCCACATTTCTGGTCTCTGGG 2865



RESULT 15 
ABQ77402/c 
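
The `/c` suffix on this identifier, together with Db coordinates that run downward in the alignment for this result, suggests that the match is reported against the complementary strand of the database entry. That reading is an assumption, not something the report states. Under it, the forward-strand bases for a Db row can be recovered by reverse-complementing the row, as in this minimal Python sketch (the helper name `reverse_complement` is illustrative):

```python
# Translation table for the four bases plus N, in upper and lower case.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]

# First ten bases of the first Db row shown for this result.
row_start = "AGTCATGCTT"
print(reverse_complement(row_start))  # AAGCATGACT
```

Applying the function twice returns the original string, which is a convenient self-check when comparing a complement-strand row with the forward-strand record.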

ID ABQ77402 standard; DNA; 183337 BP. 
XX 

AC ABQ77402; 
XX 

DT 10-MAY-2003 (first entry) 
XX 

DE Human EDNRB DNA. 
XX 

KW Human; EDNRB; vascular disease; cardiant; antiarteriosclerotic; stroke; 
KW cerebroprotective; gene therapy; coronary artery disease; ischaemia; 
KW myocardial infarction; peripheral vascular disease; pulmonary embolism; 
KW venous thromboembolism; forensic; paternity testing; GI12597038; gene; 
KW SNP; single nucleotide polymorphism; ds.
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers

FT variation replace (75672, t)

FT /*tag= a 

FT /standard_name= "SNP" 

FT /note= "Single nucleotide polymorphism (ID G337a4) which 

FT does not change the EDNRB protein"

XX 

PN WO2003016494-A2. 
XX 

PD 27-FEB-2003. 
XX 

PF 16-AUG-2002; 2002WO-US026343.
XX 

PR 16-AUG-2001; 2001US-0313097P.
PR 05-OCT-2001; 2001US-0327485P.
PR 14-DEC-2001; 2001US-00020141.
XX 

PA (VITI-) VITIVITY INC. 
XX 

PI McCarthy J, Ableson A; 
XX 

DR WPI; 2003-300617/29. 
DR P-PSDB; ABG74670. 
XX 

PT Identifying a subject as a candidate for a particular course of therapy 

PT to treat a vascular disease or disorder, e.g. stroke, myocardial 

PT infarction or ischemia by determining the identity of the nucleotide 

PT present at specific positions. 

XX 

PS Claim 1; Fig 5; 568pp; English. 



XX 

CC This invention describes a novel method for identifying a subject as a 

CC candidate for a particular course of therapy to treat a vascular disease 

CC or disorder. The method comprises determining the identity of the 

CC nucleotide present at specific positions, or their complements, and 

CC identifying the subject as a candidate for a particular clinical course 

CC of therapy based on the identity of the nucleotide present in that 

CC specific position. The method can be used for identifying a subject who 

CC is a candidate for further diagnostic evaluation of a vascular disease or 

CC disorder and selecting a clinical course of therapy. The products of the 

CC invention have cardiant, antiarteriosclerotic and cerebroprotective 

CC activity and can be used for gene therapy. The methods disclosed are 

CC useful for treating a vascular disease, e.g. atherosclerosis, coronary 

CC artery disease, myocardial infarction, ischaemia, stroke, peripheral 

CC vascular diseases, venous thromboembolism and pulmonary embolism. The DNA 

CC sequences are useful as fingerprint for detecting different individuals 

CC within the same species applicable in forensic studies and paternity 

CC testing. This sequence encodes the human EDNRB gene represented in

CC GI12597038, used to illustrate the method of the invention 

XX 

SQ Sequence 183337 BP; 56451 A; 33595 C; 34663 G; 58628 T; 0 U; 0 Other; 

Query Match 66.1%; Score 2841.8; DB 7; Length 183337; 

Best Local Similarity 99.9%; Pred. No. 0; 

Matches 2854; Conservative 0; Mismatches 2; Indels 1; Gaps 1;
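
The percentages in these summary lines appear to follow from the counts beside them. The sketch below assumes two things the report does not spell out: that Query Match is the hit score divided by the perfect score given in the report header (4301 for this query), and that Best Local Similarity is matches divided by aligned columns (matches + mismatches + indels). Both function names are illustrative.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's perfect score."""
    return round(100.0 * score / perfect_score, 1)

def local_similarity_percent(matches: int, mismatches: int, indels: int) -> float:
    """Identical columns as a percentage of all aligned columns."""
    aligned_columns = matches + mismatches + indels
    return round(100.0 * matches / aligned_columns, 1)

PERFECT_SCORE = 4301  # from the report header

# Result 14: Score 2857, 2857 matches, no mismatches or indels.
print(query_match_percent(2857, PERFECT_SCORE), local_similarity_percent(2857, 0, 0))
# Result 15: Score 2841.8, 2854 matches, 2 mismatches, 1 indel.
print(query_match_percent(2841.8, PERFECT_SCORE), local_similarity_percent(2854, 2, 1))
```

Under these assumptions the computed values agree with the 66.4% / 100.0% and 66.1% / 99.9% figures printed for results 14 and 15.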

Qy  1430 AGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGC 1489
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 72830 AGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGC 72771

Qy  1490 AGTCGTGCTTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATA 1549
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 72770 AGTCGTGCTTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATA 72711

Qy  1550 AATACAGCTCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACC 1609
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 72710 AATACAGCTCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACC 72651

Qy  1610 GAAGTCATTAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCA 1669
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 72650 GAAGTCATTAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCA 72591

Qy  1670 CAGCACACTATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATT 1729
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 72590 CAGCACACTATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATT 72531

Qy  1730 TTATGAGCTGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGA 1789
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 72530 TTATGAGCTGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGA 72471

Qy  1790 AAGCACTTAATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATT 1849
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 72470 AAGCACTTAATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATT 72411

Qy  1850 CACACAACACTTAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAG 1909
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 72410 CACACAACACTTAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAG 72351



Qy  1910 ATTTATTTTTAAATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGA 1969
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 72350 ATTTATTTTTAAATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGA 72291

Qy  1970 ACTTTTAAATGAAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTT 2029
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 72290 ACTTTTAAATGAAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTT 72231

Qy  2030 TCAATTAATATTATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTT 2089
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 72230 TCAATTAATATTATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTT 72171

Qy  2090 AGTTGTTGCATTTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAG 2149
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 72170 AGTTGTTGCATTTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAG 72111

Qy  2150 AGCAAGGCTGTTTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTG 2209
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 72110 AGCAAGGCTGTTTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTG 72051

Qy  2210 CAATATGTAACCAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGC 2269
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 72050 CAATATGTAACCAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGC 71991

Qy  2270 TGAATTTAAAATATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAG 2329
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 71990 TGAATTTAAAATATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAG 71931

Qy  2330 ATCAAACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAA 2389
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 71930 ATCAAACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAA 71871

Qy  2390 TCTGTCATTCACATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAA 2449
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 71870 TCTGTCATTCACATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAA 71811

Qy  2450 ATCTTCTTTTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTT 2509
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 71810
Qy  2510
Db 71750
Qy  2570
Db 71690
Qy  2630
Db 71630
Qy  2690
Db 71570
ATCTTCTTCTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTT 71751 

ACCTACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTAC 2569 

I || | | | I | I I || I M I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I 
ACCTACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTAC 71691 

GAT GGAGAGAT GC CAGTGAC CT CAT AAT AAAGACTGT GAACT GCCT GGT GCAGT GT C CAC 2629 

I I I M I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I 

GATGGAGAGATGCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCAC 71631 

ATGACAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGC 2 68 9 

I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I II I I I I I I I I II I I 
ATGACAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGC 71571 

AT AT GT AT AAT G C TAT AGT TAAAAT ACT AT T T T T CAAAAT C AT ACAGAT T AGT AC AT T T A 27 4 9 

I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I M I 

AT AT GT AT AAT GC TAT AGT TAAAAT ACT AT T T T T CAAAAT CAT AC AG AT T AGT AC AT T T A 71511 



Qy   2750 ACAGCTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAA 2809
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71510 ACAGCTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAA 71451

Qy   2810 AAGTTTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGT 2869
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71450 AAGTTTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGT 71391

Qy   2870 AAGAACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCT 2929
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71390 AAGAACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCT 71331

Qy   2930 TAGGATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAG 2989
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71330 TAGGATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAG 71271

Qy   2990 GAAATGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTT 3049
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71270 GAAATGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTT 71211

Qy   3050 CGTCATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTG 3109
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71210 CGTCATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTG 71151

Qy   3110 CAATGTTCTCAGAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAA 3169
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71150 CAATGTTCTCAGAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAA 71091

Qy   3170 AATATGCCCAAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATA 3229
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71090 AATATGCCCAAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATA 71031

Qy   3230 AGCTAGTAATGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAA 3289
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71030 AGCTAGTAATGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAA 70971

Qy   3290 CAATGTGGCCAGAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTT 3349
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70970 CAATGTGGCCAGAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTT 70911

Qy   3350 ATAAATCACCCACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTT 3409
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70910 ATAAATCACCCACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTT 70851

Qy   3410 ATCATAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCA 3469
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70850 ATCATAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCA 70791

Qy   3470 CAGTTTATTAATATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACT 3529
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70790 CAGTTTATTAATATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACT 70731

Qy   3530 GAATTTTTACATCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATC 3589
          ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||
Db  70730 GAATTTTTACATCCTGATACCTTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATC 70671

Qy   3590 TTGCCAAATTTTGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCA 3649
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70670 TTGCCAAATTTTGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCA 70611

Qy   3650 TTCAGTGGCTTTTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAA 3709
          |||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||
Db  70610 TTCAGTGGCTTTTT-AAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAA 70552

Qy   3710 ACAATTATAATTTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACT 3769
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70551 ACAATTATAATTTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACT 70492

Qy   3770 TCAAAACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAA 3829
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70491 TCAAAACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAA 70432

Qy   3830 CATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTAT 3889
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70431 CATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTAT 70372

Qy   3890 CCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCC 3949
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70371 CCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCC 70312

Qy   3950 AAAGGAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAA 4009
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70311 AAAGGAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAA 70252

Qy   4010 TATAACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATA 4069
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70251 TATAACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATA 70192

Qy   4070 GTTACTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTA 4129
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70191 GTTACTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTA 70132

Qy   4130 CCTTATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAA 4189
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70131 CCTTATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAA 70072

Qy   4190 TATAAATGTGACAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAA 4249
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70071 TATAAATGTGACAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAA 70012

Qy   4250 GTTATTCAATTAAAATGCCACATTTCTGGTCTCTGGG 4286
          |||||||||||||||||||||||||||||||||||||
Db  70011 GTTATTCAATTAAAATGCCACATTTCTGGTCTCTGGG 69975
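
As a small consistency check on the one gapped row of this alignment (Qy 3650-3709 against Db 70610-70552), the sketch below counts how many residues each coordinate pair spans. The helper name `residues_spanned` is hypothetical and is not part of the search tool; the coordinates and the database row are copied from the listing, and the code only assumes that both coordinates of a row are inclusive.

```python
def residues_spanned(start: int, end: int) -> int:
    """Number of residues covered by an inclusive coordinate pair.

    Works for ascending (query) and descending (database) numbering alike.
    """
    return abs(end - start) + 1


# Coordinates and database row copied from the row Qy 3650 / Db 70610.
query_residues = residues_spanned(3650, 3709)
db_residues = residues_spanned(70610, 70552)
db_row = "TTCAGTGGCTTTTT-AAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAA"

# Gap characters in the database row account for the shorter database span.
gap_count = db_row.count("-")

print(query_residues, db_residues, gap_count)
```

Under that assumption the database span plus its gap characters equals the query span, which is why the database end coordinate of this row is 70552 rather than 70551.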
RESULT 1
US-08-121-446-3
; Sequence 3, Application US/08121446
; Patent No. 6313276
; GENERAL INFORMATION:
; APPLICANT: IMURA, HIROO
; APPLICANT: NAKAO, KAZUWA
; APPLICANT: NAKANISHI, SHIGETADA
; TITLE OF INVENTION: A HUMAN ENDOTHELIN RECEPTOR
; NUMBER OF SEQUENCES: 4
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: MORRISON & FOERSTER
; STREET: 755 Page Mill Road
; CITY: Palo Alto
; STATE: California
; COUNTRY: USA
; ZIP: 94304-1018
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/121,446
; FILING DATE:
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: US 07/911,684
; FILING DATE: 10-JUL-1992
; ATTORNEY/AGENT INFORMATION:
; NAME: CIOTTI, THOMAS E.
; REGISTRATION NUMBER: 21,013
; REFERENCE/DOCKET NUMBER: 29900-20324.00
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (415) 813-5600
; TELEFAX: (415) 494-0792
; TELEX: 706141
; INFORMATION FOR SEQ ID NO: 3:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 4301 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 238..1566
US-08-121-446-3

Query Match 100.0%; Score 4301; DB 4; Length 4301;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 4301; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy      1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

Qy     61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120

Qy    121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180

Qy    181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240

Qy    241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300

Qy    301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360

Qy    361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420

Qy    421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480

Qy    481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540

Qy    541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600


Qy    601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660

Qy    661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720

Qy    721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780

Qy    781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840

Qy    841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900

Qy    901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960

Qy    961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020

Qy   1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080

Qy   1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140

Qy   1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200



Qy   1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260

Qy   1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320

Qy   1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380

Qy   1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440

Qy   1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500

Qy   1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560

Qy   1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620

Qy   1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680

Qy   1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740

Qy   1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800

Qy   1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860

Qy   1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920

Qy   1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980

Qy   1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040



Qy   2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100

Qy   2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160

Qy   2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220

Qy   2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280

Qy   2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340

Qy   2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400

Qy   2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460

Qy   2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520

Qy   2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580

Qy   2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640

Qy   2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700

Qy   2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760

Qy   2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820

Qy   2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880



Qy   2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940

Qy   2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000

Qy   3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060

Qy   3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120

Qy   3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180

Qy   3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240

Qy   3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300

Qy   3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360

Qy   3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420

Qy   3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480

Qy   3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540

Qy   3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600

Qy   3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660

Qy   3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720

Qy   3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780



Qy   3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840

Qy   3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900

Qy   3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960

Qy   3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020

Qy   4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080

Qy   4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140

Qy   4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200

Qy   4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260

Qy   4261 AAAATGCCACATTTCTGGTCTCTGGGAAAAAAAAAAAAAAA 4301
          |||||||||||||||||||||||||||||||||||||||||
Db   4261 AAAATGCCACATTTCTGGTCTCTGGGAAAAAAAAAAAAAAA 4301





RESULT 2
US-08-910-864-13
; Sequence 13, Application US/08910864
; Patent No. 6280931
; GENERAL INFORMATION:
; APPLICANT: SAKAMOTO, AIJI
; APPLICANT: HANAOKA, FUMIO
; TITLE OF INVENTION: METHOD FOR SPECIFICALLY AMPLIFYING A cDNA OF AN EXTREMELY
; TITLE OF INVENTION: SMALL QUANTITY
; NUMBER OF SEQUENCES: 13
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: FISH & RICHARDSON P.C.
; STREET: 4225 EXECUTIVE SQUARE, SUITE 1400
; CITY: LA JOLLA
; STATE: CA
; COUNTRY: USA
; ZIP: 92037
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/910,864
; FILING DATE: 13-AUG-1997
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: JP 216506/1996
; FILING DATE: 16-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: HAILE, LISA A.
; REGISTRATION NUMBER: 38,347
; REFERENCE/DOCKET NUMBER: 07898/017001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 619-678-5070
; TELEFAX: 619-678-5099
; INFORMATION FOR SEQ ID NO: 13:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1873 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: double
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA to mRNA
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 231..1556
US-08-910-864-13 



Query Match 39.3%; Score 1691.8; DB 3; Length 1873; 

Best Local Similarity 99.6%; Pred. No. 0; 

Matches 1696; Conservative 0; Mismatches 7; Indels 0; Gaps 
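
The two percentages in a result header can be reproduced from the other figures in the listing. The following is a minimal sketch, assuming that Query Match is the score divided by the perfect score (4301 in this search's header) and that Best Local Similarity is the match count divided by the number of aligned columns (matches + mismatches + indels). Both definitions are inferred from the reported numbers rather than stated in the listing, and the function names are illustrative only.

```python
PERFECT_SCORE = 4301.0  # "Perfect score" reported in the search header


def query_match(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Score as a percentage of the perfect score, rounded to one decimal."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches: int, mismatches: int, indels: int) -> float:
    """Matches as a percentage of all aligned columns (assumed definition)."""
    columns = matches + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # Figures taken from the RESULT 2 header.
    print(query_match(1691.8))
    print(best_local_similarity(1696, 7, 0))
```

Under these assumptions the same two functions also reproduce the percentages reported for RESULT 3 (score 1466.8; 1468 matches, 2 mismatches).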



Qy  178 TGAAACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGC 237
        ||   || ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  171 TGTCTCTAGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGC 230

Qy  238 ATGCAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 297
        ||||||||||||||||||||||||||||  ||||||||||||||||||||||||||||||
Db  231 ATGCAGCCGCCTCCAAGTCTGTGCGGACCGGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 290

Qy  298 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 357
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  291 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 350

Qy  358 CAAACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCC 417
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  351 CAAACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCC 410

Qy  418 AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 477
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  411 AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 470

Qy  478 CCGCCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTC 537
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  471 CCGCCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTC 530



Qy  538 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 597
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  531 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 590

Qy  598 ACACTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATC 657
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  591 ACACTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATC 650

Qy  658 GCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTAC 717
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  651 GCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTAC 710

Qy  718 AAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATA 777
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  711 AAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATA 770

Qy  778 CAGAAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATAT 837
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  771 CAGAAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATAT 830

Qy  838 CGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTA 897
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  831 CGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTA 890

Qy  898 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTT 957
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  891 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTT 950

Qy  958 GATATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTT 1017
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  951 GATATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTT 1010

Qy 1018 CAGAAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTC 1077
        |||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||
Db 1011 CAGAAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTATTCAGTTTC 1070

Qy 1078 TATTTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATG 1137
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1071 TATTTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATG 1130

Qy 1138 TTGAGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAA 1197
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1131 TTGAGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAA 1190

Qy 1198 GTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCAC 1257
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1191 GTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCAC 1250

Qy 1258 CTCAGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTT 1317
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1251 CTCAGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTT 1310

Qy 1318 TTGAGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGC 1377
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1311 TTGAGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGC 1370



Qy 1378 ATTAACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGC 1437
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1371 ATTAACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGC 1430

Qy 1438 TTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGC 1497
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1431 TTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGC 1490

Qy 1498 TTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGC 1557
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1491 TTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGC 1550

Qy 1558 TCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCAT 1617
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1551 TCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCAT 1610

Qy 1618 TAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACAC 1677
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1611 TAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACAC 1670

Qy 1678 TATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGC 1737
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1671 TATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGC 1730

Qy 1738 TGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTT 1797
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1731 TGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTT 1790

Qy 1798 AATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAAC 1857
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1791 AATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAAC 1850

Qy 1858 ACTTAGGCTTAAAAATGAGCTCA 1880
        |||||||||||||||||||||||
Db 1851 ACTTAGGCTTAAAAATGAGCTCA 1873





RESULT 3
US-09-016-434-1203
; Sequence 1203, Application US/09016434
; Patent No. 6500938
; GENERAL INFORMATION:
; APPLICANT: Janice Au-Young
; APPLICANT: Jeffrey J. Seilhamer
; TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF SIGNALING
; TITLE OF INVENTION: PATHWAY GENE EXPRESSION
; NUMBER OF SEQUENCES: 1490
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: INCYTE PHARMACEUTICALS, INC.
; STREET: 3174 PORTER DRIVE
; CITY: PALO ALTO
; STATE: CALIFORNIA
; COUNTRY: USA
; ZIP: 94304
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/016,434
; FILING DATE: HEREWITH
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; CLASSIFICATION:
; ATTORNEY/AGENT INFORMATION:
; NAME: Zeller, Karen J.
; REGISTRATION NUMBER: 37,071
; REFERENCE/DOCKET NUMBER: PA-0002 US
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (650) 855-0555
; TELEFAX: (650) 845-4166
; INFORMATION FOR SEQ ID NO: 1203:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1470 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; IMMEDIATE SOURCE:
; LIBRARY: GENBANK
; CLONE: g182275

US-09-016-434-1203 



Query Match 34.1%; Score 1466.8; DB 4; Length 1470;

Best Local Similarity 99.9%; Pred. No. 0; 

Matches 1468; Conservative 0; Mismatches 2; Indels 0; Gaps 



Qy 


192 


GAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATGCAGCCGCCTCC 

|| MM IMMMMIMMMMMIMMMMMMMMMMMMM 

GAAACTGCGGACGGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATGCAGCCGCCTCC 


251 


Db 


1 


60 


QY 


252 


AAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTGTCGCGGATCTG 

M M M M M M M M M M M M M M M M M M M M M M M M M M 1 M M M 1 

AAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTGTCGCGGATCTG 


311 


Db 


61 


120 


Qy 


312 


GG GAGAG GAGAGAGG CT T C C CGC CT GAC AGGGC C ACT C C GCT T T T GCAAAC C GC AGAGAT 

M 1 1 1 M M M M M M M 1 II M M 1 1 1 1 1 1 1 1 1 1 1 ' 

GGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAAACCGC AGAGAT 


371 


Db 


121 


180 


Qy 


372 


AATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGTCTGGCGCGGTC 

|| | || M 1 | II II II 1 1 1 1 1 1 1 1 1 1 M 1 1 II 1 1 M M II 1 1 II 1 1 M M 1 1 1 1 1 1 

AATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGTCTGGCGCGGTC 


431 


Db 


181 


240 


Qy 


432 


GTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCCACGCACCAT 

| M 1 1 1 1 1 1 1 1 M 1 M II 1 1 1 M 1 1 1 M M 1 M 1 1 1 1 M 1 M 1 II 1 M i 1 1 1 1 1 1 1 M II 

GTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCCACGCACCAT 


491 


Db 


241 


300 


Qy 


492 


CTCCCCTCCCCCGTGC CAAGG AC C CAT C GAGAT C AAGGAGACT T T CAAAT ACAT CAAC AC 

1 M II II M 1 1 1 1 1 1 1 1 1 1 M Ml II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 

CTCCCCTCCCCCGTGC C AAG GAC C CAT C GAGAT C AAG GAG AC T T T CAAAT AC AT CAAC AC 


551 


Db 


301 


360 


Qy 


552 


GGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCTGAGAAT 


611 



361 GGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCTGAGAAT 42 0 

612 TATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCCAGCTTGGCTCT 671 

|| | | I 1 I I I I I I I I M I I I I I I I MINI I 

421 TAT CT ACAAGAACAAGT GC AT GC GAAAC GGT C C CAAT AT CT T GAT C G C C AG CT T GGCT CT 4 80 

672 GGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCTGGCAGA 731 

I M | I 1 i I M M 11 I I I I I I I I i I I I I I I M M M I M MINIUM 

481 GGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCTGGCAGA 540 
732 GGAC T GG C CAT T T GGAG CT GAGAT GT GT AAG CTGGTGCCTTT CAT ACAGAAAG C C T C C GT 7 91 

MM I II II I II I I M II II I M II M I II I M I 

541 GGACTGGCCATTTGGAGCT GAGAT GTGTAAGCT GGT GCCTTT CAT ACAGAAAGCCTCCGT 600 

7 92 GGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGAGCTGTTGCTTC 851 

M I I I I I I I I I 1 I I I I M I 1 I 1 I 1 I I I 1 I 1 I I I i M I I M I I I 1 I I i t I I II I 

601 GGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGAGCTGTTGCTTC 660 

852 T T G GAGT AGAATTAAAGGAAT T GGGGT T C CAAAAT GGACAGC AGT AGAAAT T GT T T T GAT 911 

|| | | || || M M I I I I II I I M I I I M I I I I M I I M I 

661 T T GGAGT AGAATTAAAGGAAT T GGG GT T C CAAAAT GGAC AGC AGT AGAAATT GT T T T GAT 720 

912 TTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAATTACGAT 971 

I | | I I II I II II I I II M I I II M I I II M II I M M II I M I I II 

721 TTGGGT GGT CTCTGTGGTTCT GGCT GTCCCTGAAGCCATAGGTTTTGATATAATTAC GAT 780 
972 GGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAGAAGACAGCTTT 1031 

| | M I I M I I I M M I I II I II I I II I I I I M I I I I I I I I M I II I M I I II I I I M I I I 

781 GGAC T ACAAAGGAAGT TAT CT G C GAAT CTGCTTGCTT CAT C C C GT T CAGAAGAC AG CTT T 840 
1032 CATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTTGCC 1091 

| || M | M I I I I I M I I I II II I I II II I I I M I I I M II M II I I M I I I II I I I I I I I 

841 CAT G C AGT T T T ACAAGACAGCAAAAGAT T GGT GG CT GTT CAGT T T CT AT TTCTGCTTGCC 900 
1092 AT T GGC CAT C AC T GCAT T T TT T T AT AC ACT AAT GAC CT GT GAAAT GT T GAGAAAGAAAAG 1151 

| | M I M II I I I II M I I M M I II II I I I I I II I M I II I M II II M II M I M I M I 

901 AT T GGC CAT CAC T GCAT T T T T T TAT AC ACT AAT GAC CT GT GAAAT GT T GAGAAAGAAAAG 960 
1152 T GGC AT GCAGAT T G CT T TAAAT GAT CAC CTAAAG CAGAGAC GGGAAGT GGC CAAAAC C GT 1211 

I | | M M II I II I II I M II I M I I I I I II I I II II I I M I M I I M M 

961 T G GC AT GCAGAT T G CT T TAAAT GAT CAC CTAAAG CAGAGAC GG GAAGT GGC CAAAAC C GT 102 0 
1212 CTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGATTCT 1271 

| | | || | | | M I I I I I M M I I II II I I I I I I I II I II II II I I I I I II II M M I M I M 

1021 CTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGATTCT 1080 
1272 GAAGCT CACT CTT TAT AAT CAGAAT GAT C C CAAT AGAT GT GAACT T T T GAGCTT T CT GT T 1331 

| M II II I M I I I I I I I II I I II M I I I II I M M M I I II II I I I I I I I II II I I I I I I 

1081 GAAGCT CACT CTT TAT AAT CAGAAT GAT C C CAAT AGAT GT GAACT T T T GAGCTT T CT GT T 1140 
1332 GGT AT T GGACT AT AT T GGT AT CAACAT G G CT T CACT GAAT T C CT G CAT T AAC C CAAT T G C 1391 

I I I I 1 I I I t MM I I I 1 I I 1 I I I L t I I 1 I f I 1 M M I I I 

1141 GGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATTAACCCAATTGC 1200 
Qy 1392 TCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCTGCTGGTG 1451
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 TCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCTGCTGGTG 1260

Qy 1452 CCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTAAAGTTCAAAGC 1511
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 CCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTAAAGTTCAAAGC 1320

Qy 1512 TAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCATCTTGAAAGAA 1571
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 TAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCATCTTGAAAGAA 1380

Qy 1572 GAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAAAACAAAATGAA 1631
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 GAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAAAACAAAATGAA 1440

Qy 1632 ACATTTGCCAAAACAAAACAAAAAACTATG 1661
        ||||||||||||||||||||||||||||||
Db 1441 ACATTTGCCAAAACAAAACAAAAAACTATG 1470
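
Alignment rows in listings of this kind are sometimes extracted with stray spaces inside the residues or inside a coordinate. The sketch below, which is only an illustration and not part of the search program, removes those spaces and checks that the end coordinate agrees with the start coordinate and the residue count. The names `ROW`, `parse_row` and `is_consistent` are hypothetical, and the length check is valid only for rows that contain no gap characters.

```python
import re

# Label, start coordinate, residues, end coordinate. Spaces are tolerated
# inside each field because extraction may have split them.
ROW = re.compile(r"^(Qy|Db)\s+([\d ]+?)\s+([ACGTN ]+?)\s+([\d ]+)$")


def parse_row(line: str):
    """Return (label, start, residues, end) with internal spaces removed."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"not an alignment row: {line!r}")
    label, start, residues, end = match.groups()
    return (
        label,
        int(start.replace(" ", "")),
        residues.replace(" ", ""),
        int(end.replace(" ", "")),
    )


def is_consistent(row) -> bool:
    """True when end - start + 1 equals the number of residues (ungapped rows)."""
    _, start, residues, end = row
    return end - start + 1 == len(residues)


if __name__ == "__main__":
    row = parse_row("Db 1441 AC AT T T G C C AAAAC AAAAC AAAAAAC TAT G 147 0")
    print(row, is_consistent(row))
```

A row that fails the check has lost or gained characters during extraction, or contains gaps, and should be compared against a neighbouring result before it is trusted.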



RESULT 4
US-09-175-658B-20
; Sequence 20, Application US/09175658B
; Patent No. 6372900
; GENERAL INFORMATION:
; APPLICANT: METALLINOS, DANIKA
; APPLICANT: RINE, JASPER
; APPLICANT: BOWLING, ANN
; TITLE OF INVENTION: HORSE ENDOTHELIN-B RECEPTOR GENE AND GENE PRODUCTS
; FILE REFERENCE: GOBR-110
; CURRENT APPLICATION NUMBER: US/09/175,658B
; CURRENT FILING DATE: 1998-10-20
; PRIOR APPLICATION NUMBER: 60/062,562
; PRIOR FILING DATE: 1997-10-21
; NUMBER OF SEQ ID NOS: 25
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 20
; LENGTH: 1321
; TYPE: DNA
; ORGANISM: Horse
US-09-175-658B-20 

Query Match 24.9%; Score 1070.4; DB 4; Length 1321; 

Best Local Similarity 88.7%; Pred. No. 1.5e-256; 

Matches 1171; Conservative 0; Mismatches 146; Indels 3; Gaps 1; 
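
The reported score can be rebuilt from the counts on the line above. This is a sketch under stated assumptions: each match scores 1 (consistent with a perfect score equal to the query length), each gap costs Gapop 10.0 once plus Gapext 1.0 per indel position (the values in the search header), and each mismatch costs 0.6. The mismatch penalty does not appear anywhere in the listing; it is inferred because it reproduces the reported scores, and the function name is illustrative.

```python
def reconstruct_score(
    matches: int,
    mismatches: int,
    indels: int,
    gaps: int,
    mismatch_penalty: float = 0.6,  # assumed; not stated in the listing
    gapop: float = 10.0,            # "Gapop" from the search header
    gapext: float = 1.0,            # "Gapext" from the search header
) -> float:
    """Score = matches - mismatch cost - gap-open cost - gap-extension cost."""
    return (
        matches
        - mismatch_penalty * mismatches
        - gapop * gaps
        - gapext * indels
    )


if __name__ == "__main__":
    # RESULT 4 counts: 1171 matches, 146 mismatches, 3 indels in 1 gap.
    print(round(reconstruct_score(1171, 146, 3, 1), 1))
```

With the same assumptions the counts for RESULT 5 (634 matches, 307 mismatches, 24 indels in 3 gaps) also give the score reported there, 395.8.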

Qy 227 CAGGTAGCAGCATGCAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTC 286 

| | | | M | || I I I I I I I I I I I I I I I I M I I II I I I M M I II I I I I M I I 
Db 1 CAGGTAGCAGCATGCAGCCTCTGCCAACCCTGTGTGGACGCGTTCTGGTGGCGCTGATCC 60 

Qy 287 TTGCCTGCGGCCTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCA 346

MIMIIIIII II I II I I I I I I Mill I MINIM 

Db 61 TTGCCTGCGGCGTGGCAGGGGTCCAGGGAGAAGAGAGGAGATTCCCGCCGGCCAGGGCCA 120 

Qy 347 CTCCG CTTTTGCAAACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCA 403 

| | M | Mill I I II II I M II I I I II I I M II II MIN I 

Db 121 CTCCGCCACTTCTGGGGTCTGAAGAGATAATGACGCCCCCGACTAAGACCTCCTGGCCGA 180 



Qy 


404 


AGGGTTCCAACGCCAGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACA 

| | | I I I I 1 1 1 1 1 1 1 I 1 1 1 1 1 1 M 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

CGGGGTCCAACGCCAGCGTGCCGCGGTCATCAGCACCTCCGCAAATGCCTAAAGCAGGGA 


4 63 


Db 


181 


240 


Qy 


464 


GGACGGCAGGATCTCCGCCACGCAC CAT CTCCCCTCCCCCGTGCCAAGGACC CATC GAGA 

M I 1 1 II III 1 1 II 1 1 1 1 1 1 1 1 1 II 1 1 1 1 II II II 1 M 1 1 M 1 1 

GGACGG C GGGAG C C C AGC GAC GC AC CCTCCCTCCTCCCCCGT GC GAAAGAAC CAT C GAGA 


523 


Db 


241 


300 


Qy 


524 


TCAAGGAGACTTTCAAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGA 

| | | | | | | || | I I I 1 1 1 1 1 1 1 1 1 1 II 1 1 II 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 II 1 
TCAAGGAGACTTTCAAGTACATCAACACAGTAGTGTCCTGCCTAGTGTTCGTGCTGGGCA 


583 


Db 


301 


360 


Qy 


584 


T CAT C GG GAACT C C ACACT T CT G AGAAT TAT C T ACAAGAACAAGT GC AT G C GAAAC GGT C 
| | | | | I I | | II 1 1 1 1 1 1 1 1 1 1 M 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
T CAT C GGAAACT C C ACACT G CT G AGAAT CAT T T ACAAGAACAAGT GC AT GC GGAAC GG C C 


643 


Db 


361 


420 


Qy 


644 


CCAATATCTTGATCGCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCC 

| | | | | | I I I 1 1 1 1 I 1 II 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 II INN III 

CTAATATCTTGATCGCCAGCCTGGCTCTCCGAGACCTGCTGCAAATCATCATTGACGTCC 


703 


Db 


421 


480 


Qy 


704 


CTATCAATGTCTACT^AGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGC 

| | | | | | || M II 1 1 II II 1 II II 1 IIIMMIMI IMIIM IIIMIIIIIMI 
CCATCAATGTCTACAAGCTGCTGGCTGAGGACTGGCCCTTTGGAGTCGAGATGTGTAAGC 


763 


Db 


481 


540 


Qy 


764 


TGGTGCCTTTCATACAGAAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGA 

| | | | | | | || | | I I I I I I II 1 II 1 1 1 1 1 II 1 1 II II 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 M 1 
TGGTGCCTTTCATACAGAAGGCCTCCGTGGGCATCACTGTGCTGAGTCTGTGTGCTCTAA 


823 


Db 


541 


600 


Qy 


824 


GT AT T GACAGAT AT C GAGCT GTTGCTTCTTG GAGT AGAAT T AAAGGAAT T GGGGTT C CAA 

IIMIIIIIIIIIIMIMIIMIIMI 1 1 MINI Ml MINIMI 

GT AT T GACAGAT AT C GAGCT GTTGCTTCCTTG GAGC GAAT T AAAGGAAT T C GGGTT C CAA 


883 


Db 


601 


660 


Qy 


884 


AATGGACAGCAGTAGAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTG 

| | M 1 II 1 II II 1 II M M 1 1 II M 1 1 1 1 1 M 1 1 M II II M II 1 1 II 1 1 II 1 M 1 1 1 1 

AATGGACAGCAGTAGAAATTGTTTTAATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTG 


943 


Db 


661 


720 


Qy 


944 


AAGC CAT AGGTT T T GAT AT AAT T AC GAT GGACT ACAAAGGAAGT TAT CT GC GAAT CT GCT 

Mill | M 1 1 M II M 1 Mill IMIIIMIMM MINIMUM 

AAGC C GT G GGTT T T GAT AT GAT T AC C GCT GACT ACAAAGGAAGT TAT CT GC GAAT C T G C C 


1003 


Db 


721 


780 


Qy 


1004 


T GCT T CAT C C CGT T C AGAAGACAGCT T T CAT G CAGT TT T ACAAGAC AGCAAAAGAT T GGT 

I || II 1 II II 1 II 1 N 1 IMM 1 1 N II N 1 II II 1 N 1 1 1 N II N II II 

T GCT T CAT C C C ACT C AGAAAACAG C CT T CAT G CAGT T T T ACAAGAAT GC T AAG GAC T G GT 


1063 


Db 


781 


840 


Qy 


1064 


GGCTGTT CAGT TTCT AT TTCTGCTTGCCATTGGCCATCACTGCATTTTTTT AT ACACT AA 

| I II II 1 II 1 1 N 1 1 1 II 1 II II 1 II 1 1 II II II II N N 1 1 1 1 1 N 1 II 1 N 1 1 

GGCTATTTAGTTTCTATTTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACCTTGA 


1123 


Db 


841 


900 


Qy 


1124 


T GAC CT GT GAAAT GT T GAGAAAGAAAAGT GGC AT GCAGATT G CT T T AAAT GAT CAC C T AA 

Mill INN lllllllllll INIINNI Nl III 

T GAC CT GT GAAAT GTT GAGAAAGAAGAGT GGCAT GCAAATT GCTTTAAAT GAT CACT TAA 


1183 


Db 


901 


960 


Qy 


1184 


AGCAGAGACGGGAAGTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCT 

IMIINI INIINNI 1 1 N 1 t 1 I 1 L 1 1 t 1 1 1 1 1 t t 1 1 1 1 1 I 1 INI 

AGCAGAGAAGGGAAGTGGCGAAAACAGTATTCTGCCTGGTCCTTGTCTTTGCCCTGTGCT 


1243 


Db 


961 


1020 


Qy 


1244 


GGCTTCCCCTT CAC CT C AGCAGGAT T CT GAAG CT CACT CT T TAT AAT CAGAAT GAT C C C A 


1303 



Db 


1021 


M 1 1 1 1 1 1 1 II I MM 1 M M M M 1 M 1 II 

GGCTTCCTCTT C AC CT CAG CAGGAT T T T GAAAC ACACT C T T TAT GAT C AGAAT GAT C C C C 


1080 


Qy 


1304 


AT AGAT GT GAAC T T T T GAGCT T T C T GT T G GT AT T GGACT AT AT T GGT AT C AAC AT G G CT T 

M 1 II 1 II II 1 II 1 M II 1 1 1 1 1 M 1 1 II 1 Mill 1 

AT AGAT GT GAACT T T T GAG CT T T T T GT T GGT AT T GGAC TACAT T G GC AT CAAC AT GGC CT 


1363 


Db 


1081 


1140 


Qy 


1364 


CACT GAAT T C CT GCAT TAAC C CAATT G CT CT GT ATT T GGT GAGCAAAAGAT T CAAAAACT 

I MMMMMMIMM Mill 1 MIMMIMM MMM 

C C CT GAAT T C CT G CAT T AAT C CAAT AG CT C T GT AT T T G GT GAGCAAAAGAT T CAAAAACT 


1423 


Db 


1141 


1200 


Qy 


1424 


GCTTTAAGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGG 

1 1 II 1 1 1 1 1 1 1 M 1 M 1 ! 1 1 1 1 1 II 1 II II 1 II II II II 1 II II 1 1 1 1 M 1 1 1 1 1 1 1 
GCT TTAAGTCGTGCTTATGCTGCT GGT GC CAAT CAT TTGAAGAAAAACAGTCCTTGGAAG 


1483 


Db 


1201 


1260 


Qy 


1484 


AAAAGCAGT C GT G CT TAAAGT T CAAAG CT AAT GAT CAC GGAT AT GAC AACT T C C GT T C CA 

I II 1 1 1 1 1 1 1 II II II II 1 II 1 M 1 1 1 1 1 M II II 1 1 II 1 M II 1 M 1 II 1 1 1 1 1 1 1 1 

ACAAGC AGT CAT GCT TAAAGT T CAAAG CT AAT GAT CAC G GAT AT GACAAC T T C C GT T C C A 


1543 


Db 


1261 


1320 
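
The marker line between a Qy row and a Db row is fully determined by the two rows, so it can be regenerated when extraction has damaged it. The following sketch assumes both rows have already been cleaned to equal length and that gaps are written as "-"; the function names are hypothetical and the code is not taken from the search program.

```python
def match_line(query: str, subject: str, gap: str = "-") -> str:
    """Return '|' under identical residues and a space everywhere else."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "|" if q == s and q != gap else " " for q, s in zip(query, subject)
    )


def count_columns(query: str, subject: str, gap: str = "-"):
    """Return (matches, mismatches, indels) for one pair of aligned rows."""
    matches = mismatches = indels = 0
    for q, s in zip(query, subject):
        if q == gap or s == gap:
            indels += 1
        elif q == s:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, indels


if __name__ == "__main__":
    print(repr(match_line("ACGT-A", "ACCTTA")))
    print(count_columns("ACGT-A", "ACCTTA"))
```

Summing `count_columns` over every row pair of a result gives totals that can be compared with the Matches, Mismatches and Indels figures in that result's header.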



RESULT 5
US-09-016-434-1257
; Sequence 1257, Application US/09016434
; Patent No. 6500938
; GENERAL INFORMATION:
; APPLICANT: Janice Au-Young
; APPLICANT: Jeffrey J. Seilhamer
; TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF SIGNALING
; TITLE OF INVENTION: PATHWAY GENE EXPRESSION
; NUMBER OF SEQUENCES: 1490
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: INCYTE PHARMACEUTICALS, INC.
; STREET: 3174 PORTER DRIVE
; CITY: PALO ALTO
; STATE: CALIFORNIA
; COUNTRY: USA
; ZIP: 94304
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/016,434
; FILING DATE: HEREWITH
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; CLASSIFICATION:
; ATTORNEY/AGENT INFORMATION:
; NAME: Zeller, Karen J.
; REGISTRATION NUMBER: 37,071
; REFERENCE/DOCKET NUMBER: PA-0002 US
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (650) 855-0555
; TELEFAX: (650) 845-4166
; INFORMATION FOR SEQ ID NO: 1257:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 4079 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; IMMEDIATE SOURCE:
; LIBRARY: GENBANK
; CLONE: g219649
US-09-016-434-1257 

Query Match 9.2%; Score 395.8; DB 4; Length 4079; 

Best Local Similarity 65.7%; Pred. No. 2.5e-88;

Matches 634; Conservative 0; Mismatches 307; Indels 24; Gaps 3; 

Qy 505 T GC CAAGGAC C CAT C GAGAT CAAGGAGACT T T C AAAT AC AT CAACAC GGT T GT GT C CT GC 564 

| 1 | I | | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 685 T GC C C ACAG C AGACT AAAAT TACT T C AGCT T T C AAAT AC AT T AAC AC T GT GAT AT CT T GT 744 

Qy 565 CT T GT GT T C GT G CT G GGGAT CAT C G GGAACT C C ACACT T CT GAGAAT TAT CT ACAAGAAC 624 

| | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I M 
Db 745 ACTATTTTCATCGTGGGAATGGTGGGGAATGCAACTCTGCTCAGGATCATTTACCAGAAC 804

Qy 625 AAGTGCATGCGAAACGGTCCCAATATCTTGATCGCCAGCTTGGCTCTGGGAGACCTGCTG 684 

| | M | I I I I I I I IN I I I I I I M II II I I I 

Db 8 05 AAAT GT AT GAGGAAT G GC C C CAAC GC GCT GAT AGC C AGT CTTGCCCTT GGAGACCT TAT C 8 64 

Qy 685 CACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCTGGCAGAGGACTGGCC 7 40 

| | I I I I I I I I I I I I I II I I I I I I I M I I I I I I I I I I I MINI 
Db 865 TATGTGGTCATTGATCTCCCTATCAATGTATTTAAGCTGCTGGCTGGGCGCTGGCCTTTT 924 



Qy 



741 AT T T GGAGCT GAGAT GT GT AAGCT GGT G C CT T T CAT AC AGAAAGC CT C C 7 89 

| | | | I I I I I I I I II I I I I I I I I I I I I I I I I 

Db 925 GATCACAATGACTTTGGCGTATTTCTTTGCAAGCTGTTCCCCTTTTTGCAGAAGTCCTCG 984 



Qv 790 GT GGGAAT CACT GT GCT GAGT CT ATGT GCT CT GAGT ATT GACAGAT AT CGAGCT GTT GCT 849 

| | | I I II I II M I I I I I I M I I M I I I M M 

Db 985 GTGGGGATCACCGTCCTCAACCTCTGCGCTCTTAGTGTTGACAGGTACAGAGCAGTTGCC 1044 

Qy 850 T CT T G GAGT AGAAT T AAAGGAAT T GG GGT T CCAAAAT GGACAGC AGT AGAAATT GTT T T G 909 

M | M I I I I I I I I I I I I I I I I I I M I I N I I I I I I I I I I 

Db 1045 TCCTGGAGTCGTGTTCAGGGAATTGGGATTCCTTTGGTAACTGCCATTGAAATTGTCTCC 1104 

Qy 910 ATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAATTACG 969 

| | | | | I IN I I I I I I I I I I I I I I II I I II I I I II I I 
Db 1105 ATCTGGATCCTGTCCTTTATCCTGGCCATTCCTGAAGCGATTGGCTTCGTCATGGTACCC 1164 

Qy 970 AT GGACT ACAAAG GAAGT TAT C T GC GAAT C T G C T T GCT T CAT C C C GT T C AGAAGAC AGCT 1029 

| M || | || II I I I I I II I I I I I I I M 

Db 1165 T T T GAAT AT AG GG GT GAACAGC AT AAAAC CT GTAT GCT CAAT GC C ACATCAAAA 1218 

Qy 1030 TTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTTG 1089 

| | | | | | || | | | | | I I I II I I I I I I I I I I I I I I I I II I I I I I I I H 

Db 1219 TTCATGGAGTTCTACCAAGATGTAAAGGACTGGTGGCTCTTCGGGTTCTATTTCTGTATG 1278 

Qy 1090 C CAT T GGC C AT CACT GC AT T T T T T TAT AC AC T AAT G AC C T GT GAAAT GT T G AGAAAG 114 6 

It lilt I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I Mill 



Db 1279 CCCTTGGTGTGCACTGCGATCTTCTACACCCTCATGACTTGTGAGATGTTGAACAGAAGG 1338 

Qy 1147 AAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTGGCCAAA 1206 

|| | MM I I M I M M M I M M I M 

Db 1339 AAT GGC AG C T T GAGAAT T G C C CT C AGT GAAC AT CT T AAGC AG C GT C GAGAAGT GGCAAAA 1398 

Qy 1207 ACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGG 1266

M M M Ml I I M M M I I I I II M M I M I I 

Db 1399 ACAGTTTTCTGCTTGGTTGTAATTTTTGCTCTTTGCTGGTTCCCTCTTCACTTAAGCCGT 1458 

Qy 1267 AT T CT GAAG CT CACT C T T TAT AAT CAGAAT GAT C C C AAT AGAT GT GAACT T T T GAGCT T T 1326 

|| Mill M I I I I I M II M M I I I M I M I I I I II 

Db 1459 AT AT T GAAGAAAACT GT GT AT AAC GAAAT GGACAAGAAC C GAT GT GAAT T ACT T AGT T T C 1518 

Qy 1327 CT GTT GGT AT T G GACT AT AT T GGT AT C AAC AT G GCT T CACT GAAT T C CT G CAT T AAC C C A 138 6 

I M I II I I I I M I M M I I II I II I I 

Db 1519 TTACTGCTCATGGATTACATCGGTATTAACTTGGCAACCATGAATTCATGTATAAACCCC 1578 

Qy 1387 ATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCTGC 1446

II I I II I I M I I I II M I M M M M INN Nil 

Db 1579 ATAGCTCTGTATTTTGTGAGCAAGAAATTTAAAAATTGTTTCCAGTCATGCCTCTGCTGC 1638 

Qy 1447 TGGTG 1451 

WW 

Db 1639 TGCTG 1643 



RESULT 6
US-08-121-446-1
; Sequence 1, Application US/08121446
; Patent No. 6313276
; GENERAL INFORMATION:
; APPLICANT: IMURA, HIROO
; APPLICANT: NAKAO, KAZUWA
; APPLICANT: NAKANISHI, SHIGETADA
; TITLE OF INVENTION: A HUMAN ENDOTHELIN RECEPTOR
; NUMBER OF SEQUENCES: 4
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: MORRISON & FOERSTER
; STREET: 755 Page Mill Road
; CITY: Palo Alto
; STATE: California
; COUNTRY: USA
; ZIP: 94304-1018
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/121,446
; FILING DATE:
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: US 07/911,684
; FILING DATE: 10-JUL-1992
; ATTORNEY/AGENT INFORMATION:
; NAME: CIOTTI, THOMAS E.
; REGISTRATION NUMBER: 21,013
; REFERENCE/DOCKET NUMBER: 29900-20324.00
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (415) 813-5600
; TELEFAX: (415) 494-0792
; TELEX: 706141
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 4105 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 485..1768
; FEATURE:
; NAME/KEY: mat_peptide
; LOCATION: 545
US-08-121-446-1

Query Match 9.2%; Score 395.8; DB 4; Length 4105;
Best Local Similarity 65.7%; Pred. No. 2.5e-88;
Matches 634; Conservative 0; Mismatches 307; Indels 24; Gaps 3;



Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 



505 T GC CAAGGAC C CAT C GAGAT CAAG GAG ACT T T CAAAT ACAT CAACAC GGT T GT GT C CT GC 564 

| | | | | | | I I I I I I I I I I I I II I I I I I I I I M I I I I I 

68 9 T GC C CACAGC AGACT AAAAT T ACT T CAGCT T T CAAAT AC AT T AAC ACT GT GAT AT CT T GT 74 8 

565 CT T GT GT T CGT GCT GGGGAT CAT C GG GAACT CCAC ACT T CT GAGAAT TAT CT ACAAGAAC 624 

| | M | | MM II I II II I I II I I II M I I I I I I I I I I I I 
74 9 ACT AT T T T CAT C GT GGGAAT G GT GG GGAAT GCAACT CT G CT C AGGAT CAT T T AC CAGAAC 808 

625 AAGT GCAT GC GAAAC GGT C CCAATATCTT GAT CGC CAGCT TG GCT CTGGGAGACCT GCT G 684 

M | I II I I I I II I M I I I I 1 I I I I I I I I I I M I 

809 AAATGTATGAGGAATGGCCCCAACGCGCTGATAGCCAGTCTTGCCCTTGGAGACCTTATC 868 



685 



869 



741 



929 



740 



CACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCTGGCAGAGGACTGGCC 

| | || | M I II I M I M I II M I I I I II I M I I I I I I I I I I I M 
TATGTGGTCATTGATCTCCCTATCAATGTATTTAAGCTGCTGGCTGGGCGCTGGCCTTTT 

ATTTGGAGCT GAGAT GTGTAAGCT GGT GCCTTTC AT ACAGAAAGCCTCC 

M I II I I II I I I II I I II I I I II I II I I M 

GATCACAATGACTTTGGCGTATTTCTTTGCAAGCTGTTCCCCTTTTTGCAGAAGTCCTCG 



7 90 GT GGGAAT CACT GT GCT GAGT CT AT GT GCT CT GAGTATT GACAGATAT CGAGCT GTTGCT 

| | || | M M I M M I M I I I I I I I I I I I I I I M I II M M I I I II 

989 GTGGGGATCACCGTCCTCAACCTCTGCGCTCTTAGTGTTGACAGGTACAGAGCAGTTGCC 

850 T CT T GGAGT AGAAT T AAAGGAAT T GG G GTT C CAAAAT G GACAGC AGT AGAAAT T GT TT T G 

|| || || M I I I I I II I I M I I II M I I I I I I M M I I I I 

1049 TCCT GGAGT CGT GTT CAGGGAATTGGGATTCCTTT GGT AACTGCCATTGAAATTGTCTCC 

910 ATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAATTACG 

|| | | | | I I I I I II I M I II I I I I I I I I I I M I M I I 
110 9 ATCTGGATCCTGTCCTTTATCCTGGCCATTCCTGAAGCGATTGGCTTCGTCATGGTACCC 



928 



789 



988 



849 



1048 



909 



1108 



969 



1168 



Qv 970 AT G GAC T AC AAAGGAAGT TAT CT GC GAAT CT GCT T GCT T CAT C C C GT T CAGAAGACAG CT 1029 

| | | | | | 1 I I I I I II 

Db 1169 T T T GAAT AT AGG GGT GAAC AG C AT AAAACCT GT AT GCT C AAT GC C ACATCAAAA 1222 

Qy 1030 TTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTTG 108 9 

Mill | M I I I I II II 

Db 1223 TTCATGGAGTTCTACCAAGATGTAAAGGACTGGTGGCTCTTCGGGTTCTATTTCTGTATG 1282 

q v 1090 C C ATT G G C CAT C ACT G CAT T T T T T T AT ACACT AAT GAC CT GT GAAAT GT T G AGAAAG 1146 

| | | | | | | I I I I I I I I I I I I I I I I I I I I M Ml 

Db 12 83 C C CT T GGT GT GCACT GC GAT CTTCTACACCCT CAT GACTTGTGAGATGTT GAAC AGAAGG 1342 

Qy 1147 AAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTGGCCAAA 1206

|| | || M I I I M I I I II I I I I I I I I I I II 

Db 1343 AAT GGC AG CT T GAGAAT T G C C CT C AGT GAAC AT CT T AAGC AGC GT C GAGAAGT GG CAAAA 14 02 

Qy 1207 ACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGG 1266

| | | | | | M | I I M I I II I II I I I I I I I I I I I II I I 

Db 14 03 ACAGTTTTCTGCTTGGTTGTAATTTTTGCTCTTTGCTGGTTCCCTCTTCACTTAAGCCGT 14 62 

Qv 12 67 AT T CT GAAGCT C ACT CT T TAT AAT C AG AAT GAT C C CAAT AGAT GT GAAC T T T T GAG CT T T 132 6 

|| I M I NNI MIM 

Db 14 63 AT AT T GAAGAAAACT GT GT AT AAC GAAAT G GACAAGAAC C GAT GT GAAT T ACT T AGT TT C 1522 

Qy 1327 CTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATTAACCCA 1386

| Ml || I I I I I I I I I II I I I I M M I I M I I I I M 

Db 1523 T TACT GCT CAT G GAT T ACAT C GGT AT T AACT T GGCAAC CAT GAAT T CAT GT ATAAAC C C C 1582 

Q V 1387 ATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCTGC 1446 

I I I I I II I I I I I M II I I I Ml' 1 Ml 

Db 1583 ATAGCTCTGTATTTTGTGAGCAAGAAATTTAAAAATTGTTTCCAGTCATGCCTCTGCTGC 1642 

Qy 1447 TGGTG 1451 

I I I I 

Db 1643 TGCTG 1647 



RESULT 7 

PCT-US92-02091-1 

; Sequence 1, Application PC/TUS9202091
; GENERAL INFORMATION:
; APPLICANT: Battey Jr., James F.
; APPLICANT: Corjay, Martha H.
; APPLICANT: Feldman, Richard I.
; APPLICANT: Harkins, Richard N.
; TITLE OF INVENTION: RECEPTORS FOR BOMBESIN-LIKE PEPTIDES
; NUMBER OF SEQUENCES: 8
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Edwin P. Ching
; STREET: 1501 Harbor Bay Parkway
; CITY: Alameda
; STATE: CA
; COUNTRY: USA
; ZIP: 94501
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: PCT/US92/02091
; FILING DATE: 19920313
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: US 07/426,150
; FILING DATE: 24-OCT-1989
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: US 07/533,659
; FILING DATE: 05-JUN-1990
; ATTORNEY/AGENT INFORMATION:
; NAME: Ching, Edwin P.
; REGISTRATION NUMBER: 34090
; REFERENCE/DOCKET NUMBER: A-0092C
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 415-266-7476
; TELEFAX: 415-266-7400
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1700 base pairs
; TYPE: NUCLEIC ACID
; STRANDEDNESS: double
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA to mRNA
; HYPOTHETICAL: NO
; ORIGINAL SOURCE:
; ORGANISM: Mus musculus
; CELL LINE: Swiss 3T3
; IMMEDIATE SOURCE:
; LIBRARY: Lambda GT10
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 378..1532
PCT-US92-02091-1 

Query Match 3.1%; Score 132.2; DB 5; Length 1700; 

Best Local Similarity 56.9%; Pred. No. 6.4e-23; 

Matches 242; Conservative 0; Mismatches 183; Indels 0; Gaps 0; 

Qy    535 TTCAAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAAC 594
          ||||  ||  ||| | | |  || |   | ||| |  ||||| | ||  | || || |||
Db    495 TTCATCTATGTCATCCCTGCAGTTTATGGGCTTATCATCGTGATAGGTCTTATTGGCAAC 554

Qy    595 TCCACACTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTG 654
            ||| ||  | |  ||  ||| || |  ||||| |||||||||||  || ||  | ||
Db    555 ATCACGCTCATCAAGATCTTCTGCACGGTCAAGTCCATGCGAAACGTGCCAAACCTGTTC 614

Qy    655 ATCGCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTC 714
          ||| | ||| ||||| |||||||||||||||   | || |    |  |||| |  ||| |
Db    615 ATCTCTAGCCTGGCTTTGGGAGACCTGCTGCTGCTGGTGACATGCGCCCCTGTGGATGCC 674

Qy    715 TACAAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTC 774
            ||||   ||||| ||    |||| ||||||          || || ||| | || ||
Db    675 AGCAAGTACCTGGCTGACAGGTGGCTATTTGGCAGAATTGGCTGCAAACTGATCCCCTTT 734

Qy    775 ATACAGAAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGA 834
          |||||     | || |||||  |  ||||  | |  ||    || |||     ||||||
Db    735 ATACAACTTACTTCAGTGGGGGTGTCTGTCTTCACACTTACGGCACTGTCAGCTGACAGG 794

Qy    835 TATCGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCA 894
          ||   |||  |||        |     ||  | | |      |  |  |    ||
Db    795 TACAAAGCCATTGTACGGCCAATGGATATCCAGGCATCCCATGCCCTGATGAAGATCTGT 854

Qy    895 GTAGAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGT 954
           |  ||  || |||||| ||| | |||||| || |  ||||  |||| || ||  |   |
Db    855 CTCAAAGCTGCTTTGATCTGGATTGTCTCTATGTTGTTGGCCATCCCAGAGGCTGTGTTT 914

Qy    955 TTTGA 959
          | |||
Db    915 TCTGA 919



RESULT 8 

US-08-724-394A-20 

; Sequence 20, Application US/08724394A
; Patent No. 5872237
; GENERAL INFORMATION:
; APPLICANT: Feder, John N.
; APPLICANT: Kronmal, Gregory S.
; APPLICANT: Lauer, Peter M.
; APPLICANT: Ruddy, David A.
; APPLICANT: Thomas, Winston
; APPLICANT: Tsuchihashi, Zenta
; APPLICANT: Wolff, Roger K.
; TITLE OF INVENTION: Megabase Transcript Map: Novel
; TITLE OF INVENTION: Sequences and Antibodies Thereto
; NUMBER OF SEQUENCES: 31
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: TOWNSEND and TOWNSEND and CREW LLP
; STREET: Two Embarcadero Center, 8th Floor
; CITY: San Francisco
; STATE: CA
; COUNTRY: USA
; ZIP: 94111-3834
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.30
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/724,394A
; FILING DATE: 01-OCT-1996
; CLASSIFICATION: 536
; ATTORNEY/AGENT INFORMATION:
; NAME: Fitts, Renee A.
; REGISTRATION NUMBER: 35,136
; REFERENCE/DOCKET NUMBER: 017957-000100
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 415-576-0200
; TELEFAX: 415-576-0300
; INFORMATION FOR SEQ ID NO: 20:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 246240 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: not relevant
; TOPOLOGY: not relevant
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: 1..246240
; OTHER INFORMATION: /note= "HLA-H . CONTIG"
US-08-724-394A-20 

Query Match 2.7%; Score 114.6; DB 2; Length 246240; 

Best Local Similarity 82.0%; Pred. No. 2.5e-17; 

Matches 132; Conservative 0; Mismatches 29; Indels 0; Gaps 0; 

Qy      5 CATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCCAGGT 64
          ||| ||   |||| ||||  |||||   |||| ||   ||||  |||||| |||||||||
Db 180691 CATCCCTACGGGGAACTCCAGCCAGTTTGAGCGACACAGATCTGGAGAGCGCTCCCAGGT 180750

Qy     65 AGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGGAGGA 124
          ||||| |||||||||||| ||||||  |||||||||   ||||||||||| |||||||||
Db 180751 AGGCAATTGCCCCGGTGGAACGCCTCACCAGAGCAGCACGTGGCAGGCCCTCGTGGAGGA 180810

Qy    125 TCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGG 165
          ||||| |||||||||||||| ||||||||||||| |||| |
Db 180811 TCAACGCAGTGGCTGAACACCGGGAAGGAACTGGCACTTTG 180851



RESULT 9 

US-08-724-394A-21 

; Sequence 21, Application US/08724394A
; Patent No. 5872237
; GENERAL INFORMATION:
; APPLICANT: Feder, John N.
; APPLICANT: Kronmal, Gregory S.
; APPLICANT: Lauer, Peter M.
; APPLICANT: Ruddy, David A.
; APPLICANT: Thomas, Winston
; APPLICANT: Tsuchihashi, Zenta
; APPLICANT: Wolff, Roger K.
; TITLE OF INVENTION: Megabase Transcript Map: Novel
; TITLE OF INVENTION: Sequences and Antibodies Thereto
; NUMBER OF SEQUENCES: 31
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: TOWNSEND and TOWNSEND and CREW LLP
; STREET: Two Embarcadero Center, 8th Floor
; CITY: San Francisco
; STATE: CA
; COUNTRY: USA
; ZIP: 94111-3834
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.30
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/724,394A
; FILING DATE: 01-OCT-1996
; CLASSIFICATION: 536
; ATTORNEY/AGENT INFORMATION:
; NAME: Fitts, Renee A.
; REGISTRATION NUMBER: 35,136
; REFERENCE/DOCKET NUMBER: 017957-000100
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 415-576-0200
; TELEFAX: 415-576-0300
; INFORMATION FOR SEQ ID NO: 21:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 246240 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: not relevant
; TOPOLOGY: not relevant
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: 1..246240
; OTHER INFORMATION: /note= "HLA-H . CONTIG"
US-08-724-394A-21 

Query Match 2.7%; Score 114.6; DB 2; Length 246240; 

Best Local Similarity 82.0%; Pred. No. 2.5e-17; 

Matches 132; Conservative 0; Mismatches 29; Indels 0; Gaps 0;

Qy      5 CATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCCAGGT 64
          ||| ||   |||| ||||  |||||   |||| ||   ||||  |||||| |||||||||
Db 180691 CATCCCTACGGGGAACTCCAGCCAGTTTGAGCGACACAGATCTGGAGAGCGCTCCCAGGT 180750

Qy     65 AGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGGAGGA 124
          ||||| |||||||||||| ||||||  |||||||||   ||||||||||| |||||||||
Db 180751 AGGCAATTGCCCCGGTGGAACGCCTCACCAGAGCAGCACGTGGCAGGCCCTCGTGGAGGA 180810

Qy    125 TCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGG 165
          ||||| |||||||||||||| ||||||||||||| |||| |
Db 180811 TCAACGCAGTGGCTGAACACCGGGAAGGAACTGGCACTTTG 180851



RESULT 10 
US-08-724-394A-22 

; Sequence 22, Application US/08724394A
; Patent No. 5872237
; GENERAL INFORMATION:
; APPLICANT: Feder, John N.
; APPLICANT: Kronmal, Gregory S.
; APPLICANT: Lauer, Peter M.
; APPLICANT: Ruddy, David A.
; APPLICANT: Thomas, Winston
; APPLICANT: Tsuchihashi, Zenta
; APPLICANT: Wolff, Roger K.
; TITLE OF INVENTION: Megabase Transcript Map: Novel
; TITLE OF INVENTION: Sequences and Antibodies Thereto
; NUMBER OF SEQUENCES: 31
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: TOWNSEND and TOWNSEND and CREW LLP
; STREET: Two Embarcadero Center, 8th Floor
; CITY: San Francisco
; STATE: CA
; COUNTRY: USA
; ZIP: 94111-3834
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.30
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/724,394A
; FILING DATE: 01-OCT-1996
; CLASSIFICATION: 536
; ATTORNEY/AGENT INFORMATION:
; NAME: Fitts, Renee A.
; REGISTRATION NUMBER: 35,136
; REFERENCE/DOCKET NUMBER: 017957-000100
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 415-576-0200
; TELEFAX: 415-576-0300
; INFORMATION FOR SEQ ID NO: 22:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 246240 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: not relevant
; TOPOLOGY: not relevant
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: 1..246240
; OTHER INFORMATION: /note= "HLA-H . CONTIG"
US-08-724-394A-22 

Query Match 2.7%; Score 114.6; DB 2; Length 246240; 

Best Local Similarity 82.0%; Pred. No. 2.5e-17; 

Matches 132; Conservative 0; Mismatches 29; Indels 0; Gaps 0; 

Qy      5 CATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCCAGGT 64
          ||| ||   |||| ||||  |||||   |||| ||   ||||  |||||| |||||||||
Db 180691 CATCCCTACGGGGAACTCCAGCCAGTTTGAGCGACACAGATCTGGAGAGCGCTCCCAGGT 180750

Qy     65 AGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGGAGGA 124
          ||||| |||||||||||| ||||||  |||||||||   ||||||||||| |||||||||
Db 180751 AGGCAATTGCCCCGGTGGAACGCCTCACCAGAGCAGCACGTGGCAGGCCCTCGTGGAGGA 180810

Qy    125 TCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGG 165
          ||||| |||||||||||||| ||||||||||||| |||| |
Db 180811 TCAACGCAGTGGCTGAACACCGGGAAGGAACTGGCACTTTG 180851



RESULT 11 



PCT-US92-02091-5 

; Sequence 5, Application PC/TUS9202091
; GENERAL INFORMATION:
; APPLICANT: Battey Jr., James F.
; APPLICANT: Corjay, Martha H.
; APPLICANT: Feldman, Richard I.
; APPLICANT: Harkins, Richard N.
; TITLE OF INVENTION: RECEPTORS FOR BOMBESIN-LIKE PEPTIDES
; NUMBER OF SEQUENCES: 8
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Edwin P. Ching
; STREET: 1501 Harbor Bay Parkway
; CITY: Alameda
; STATE: CA
; COUNTRY: USA
; ZIP: 94501
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: PCT/US92/02091
; FILING DATE: 19920313
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: US 07/426,150
; FILING DATE: 24-OCT-1989
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: US 07/533,659
; FILING DATE: 05-JUN-1990
; ATTORNEY/AGENT INFORMATION:
; NAME: Ching, Edwin P.
; REGISTRATION NUMBER: 34090
; REFERENCE/DOCKET NUMBER: A-0092C
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 415-266-7476
; TELEFAX: 415-266-7400
; INFORMATION FOR SEQ ID NO: 5:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1584 base pairs
; TYPE: NUCLEIC ACID
; STRANDEDNESS: double
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA to mRNA
; HYPOTHETICAL: NO
; ORIGINAL SOURCE:
; ORGANISM: Rattus rattus
; TISSUE TYPE: Esophagus
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 132..1304
PCT-US92-02091-5 

Query Match 2.5%; Score 108.8; DB 5; Length 1584; 

Best Local Similarity 54.8%; Pred. No. 4.1e-17; 

Matches 215; Conservative 0; Mismatches 177; Indels 0; Gaps 0; 



Qy    572 TCGTGCTGGGGATCATCGGGAACTCCACACTTCTGAGAATTATCTACAAGAACAAGTGCA 631
          ||  | ||||  |  | || |||  ||  ||  |||  ||  ||  ||  ||||    ||
Db    292 TCTCGGTGGGCTTGCTGGGCAACATCATGCTGGTGAAGATATTCCTCACCAACAGCACCA 351

Qy    632 TGCGAAACGGTCCCAATATCTTGATCGCCAGCTTGGCTCTGGGAGACCTGCTGCACATCG 691
          |||| |  |  ||||| ||||| ||| | | | |||||  ||||||||||||||   |
Db    352 TGCGGAGTGTCCCCAACATCTTCATCTCTAACCTGGCTGCGGGAGACCTGCTGCTGCTGC 411

Qy    692 TCATTGACATCCCTATCAATGTCTACAAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTG 751
          | |    | ||||  |  ||| || |       |    || || |||   || ||
Db    412 TGACCTGCGTCCCAGTGGATGCCTCCCGATACTTCTTTGATGAATGGGTGTTCGGCAAGC 471

Qy    752 AGATGTGTAAGCTGGTGCCTTTCATACAGAAAGCCTCCGTGGGAATCACTGTGCTGAGTC 811
           |   || || ||  | ||   ||| |||    |||| |||||  |  | ||| | | ||
Db    472 TGGGCTGCAAACTCATCCCAGCCATCCAGCTCACCTCGGTGGGGGTTTCCGTGTTCACTC 531

Qy    812 TATGTGCTCTGAGTATTGACAGATATCGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAA 871
          |    || || ||   |||||| ||  ||||| | |        |     ||  |
Db    532 TCACGGCCCTCAGCGCTGACAGGTACAGAGCTATCGTGAACCCCATGGACATGCAGACGT 591

Qy    872 TTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTC 931
           ||| ||       |||||     |  |    ||    || ||||||||||||||| |
Db    592 CTGGTGTGGTGCTGTGGACCAGTTTGAAGGCCGTGGGCATCTGGGTGGTCTCTGTGCTGT 651

Qy    932 TGGCTGTCCCTGAAGCCATAGGTTTTGATATA 963
          ||||||||||||| ||  |   ||  ||  ||
Db    652 TGGCTGTCCCTGAGGCTGTGTTTTCGGAAGTA 683



RESULT 12 

US-09-175-658B-25/c
; Sequence 25, Application US/09175658B
; Patent No. 6372900
; GENERAL INFORMATION:
; APPLICANT: METALLINOS, DANIKA
; APPLICANT: RINE, JASPER
; APPLICANT: BOWLING, ANN
; TITLE OF INVENTION: HORSE ENDOTHELIN-B RECEPTOR GENE AND GENE PRODUCTS
; FILE REFERENCE: GOBR-110
; CURRENT APPLICATION NUMBER: US/09/175,658B
; CURRENT FILING DATE: 1998-10-20
; PRIOR APPLICATION NUMBER: 60/062,562
; PRIOR FILING DATE: 1997-10-21
; NUMBER OF SEQ ID NOS: 25
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 25
; LENGTH: 801
; TYPE: DNA
; ORGANISM: Horse
; FEATURE:
; OTHER INFORMATION: Uncertain of the nucleotide sequence at positions
; OTHER INFORMATION: 30, 54, 286, 436, 445, 542, 614, 617, 624, 641,
; OTHER INFORMATION: 731, 746, 753, 770, 775 and 793.
US-09-175-658B-25 

Query Match 2.5%; Score 107.2; DB 4; Length 801; 



Best Local Similarity 86.8%; Pred. No. 7e-17; 

Matches 118; Conservative 0; Mismatches 18; Indels 0; Gaps 0; 

Qy   1408 AAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAA 1467
          ||| || | |     |   |  |||| |||||||||||||||||||| ||||||||||||
Db    223 AAACGAGTTATTTGTTTTGTACAGTCGTGCTTATGCTGCTGGTGCCAATCATTTGAAGAA 164

Qy   1468 AAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTAAAGTTCAAAGCTAATGATCACGGATAT 1527
          |||||||||||||| || |||||||| |||||||||||||||||||||||||||||||||
Db    163 AAACAGTCCTTGGAAGACAAGCAGTCATGCTTAAAGTTCAAAGCTAATGATCACGGATAT 104

Qy   1528 GACAACTTCCGTTCCA 1543
          ||||||||||||||||
Db    103 GACAACTTCCGTTCCA 88



RESULT 13 
US-09-120-772-1 

; Sequence 1, Application US/09120772
; Patent No. 6143521
; GENERAL INFORMATION:
; APPLICANT: LANE, PAMELA
; APPLICANT: TSUI, PING
; APPLICANT: ELSHOURBAGY, NABIL
; TITLE OF INVENTION: HUMAN BOMBESIN RECEPTOR SUBTYPE
; TITLE OF INVENTION: 3
; NUMBER OF SEQUENCES: 2
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Ratner & Prestia
; STREET: P.O. Box 980
; CITY: Valley Forge
; STATE: PA
; COUNTRY: USA
; ZIP: 19482
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ for Windows Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/120,772
; FILING DATE: 22-JUL-1998
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; ATTORNEY/AGENT INFORMATION:
; NAME: Prestia, Paul F
; REGISTRATION NUMBER: 23,031
; REFERENCE/DOCKET NUMBER: GP-70505
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 610-407-0700
; TELEFAX: 610-407-0700
; TELEX: 846169
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1205 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
US-09-120-772-1 

Query Match 2.5%; Score 106.6; DB 3; Length 1205; 

Best Local Similarity 48.5%; Pred. No. 1.2e-16; 

Matches 425; Conservative 0; Mismatches 434; Indels 18; Gaps 4; 

Qy    578 TGGGGATCATCGGGAACTCCACACTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAA 637
          |||| ||| | || ||  | |  ||  | | | |  | | ||||| ||| | ||||| ||
Db    181 TGGGCATCCTTGGAAATGCTATTCTCATCAAAGTCTTTTTCAAGACCAAATCCATGCAAA 240

Qy    638 ACGGTCCCAATATCTTGATCGCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTG 697
            | ||| ||||| || ||| ||||| ||||| | ||||| ||  | |   |  | | |
Db    241 CAGTTCCAAATATTTTCATCACCAGCCTGGCTTTTGGAGATCTTTTACTTCTGCTAACTT 300

Qy    698 ACATCCCTATCAATGTCTACAAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGT 757
             | ||  |  |||      |    || ||||| |  ||||  || |||         |
Db    301 GTGTGCCAGTGGATGCAACTCACTACCTTGCAGAAGGATGGCTGTTCGGAAGAATTGGTT 360

Qy    758 GTAAGCTGGTGCCTTTCATACAGAAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTG 817
          ||||| || |  ||||||| | |    | || || ||  |  | ||| | |   ||
Db    361 GTAAGGTGCTCTCTTTCATCCGGCTCACTTCTGTTGGTGTGTCAGTGTTCACATTAGCAA 420

Qy    818 CTCTGAGTATTGACAGATATCGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGG 877
           ||| ||   |||||||||    || ||||          |       |          |
Db    421 TTCTCAGCGCTGACAGATACAAGGCAGTTGTGAAGCCACTTGAGCGACAGCCCTCCAATG 480

Qy    878 TTCCAAAATGGACAGCAGTAGAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTG 937
                    |||    ||| ||  ||  |   | ||| | || ||| || |  | |||
Db    481 CCATCCTGAAGACTTGTGTAAAAGCTGGCTGCGTCTGGATCGTGTCTATGATATTTGCTC 540

Qy    938 TCCCTGAAGCCATAGGTTTTGATATA-----ATTACGATGGACTACAAAGGAAGTTATCT 992
          | ||||| || |||  ||   || ||      || |||    | |  ||  |  | |  |
Db    541 TACCTGAGGCTATATTTTCAAATGTATACACTTTTCGAGATCCCAATAAAAATATGACAT 600

Qy    993 GCGAATCTGCTTGCT-TCATCCCGTTCAGAAGACAGCTTTCATGCAGTTTTACAAGACAG 1051
            |||||   |  || | |||| ||    ||||     ||     |  |  |
Db    601 TTGAATCATGTACCTCTTATCCTGTCTCTAAGAAGCTCTTGCAAGAAATACATTCTCTGC 660

Qy   1052 CAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTTGCCATTGGCCATCACTGCATTTT 1111
                  |    | | | || | |     |||   |    |  | |  |  | | ||
Db    661 TGTGCTTCTTAGTGTTCTACATTATTCCACTCTCTATTATCTCTGTCTACTATTCCTTGA 720

Qy   1112 TTTATACACTAATGACCTGTGAAATGTTGAGAAAGAAAAGTGGCATGCAGATTGCTTTAA 1171
          ||  ||      |   |      |   |||  |     | ||     || |    |
Db    721 TTGCTAGGACCCTTTACAAAAGCACCCTGAACATACCTACTGAGGAACAAAGCCATGCCC 780

Qy   1172 ATGATCACCTAAAGCAGAGACGGGAAGTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCT 1231
           | | ||  |  |     ||  |  | | |||| ||| || ||     ||||   | | |
Db    781 GTAAGCAGATTGAATCCCGAAAGAGAATTGCCAGAACGGTATTGGTGTTGGTGGCTCTGT 840

Qy   1232 TTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGATTCTGAAGCTCACTCTTTATAATC 1291
          |||||||||||||| | ||   |||||||         ||| | |||   | ||
Db    841 TTGCCCTCTGCTGGTTGCCAAATCACCTC---------CTGTACCTCTACCATTCATTCA 891

Qy   1292 AGAATGATCCCAATAGATGTGAACTTTTGAGCTTTCTGTTGGTATTGGACTATATTGGTA 1351
              | |  || ||  |   || |  |    | | |  ||  | ||   |  | |   |
Db    892 CTTCTCAAACCTATGTA---GACCCCTCTGCCATGCATTTCATTTTCACCATTTTCTCTC 948

Qy   1352 TCAACATGGCTTCACTGAATTCCTGCATTAACCCAATTGCTCTGTATTTGGTGAGCAAAA 1411
                ||||||     ||||| ||| | |||||  ||||||| || | | |||||||||
Db    949 GGGTTTTGGCTTTCAGCAATTCTTGCGTAAACCCCTTTGCTCTCTACTGGCTGAGCAAAA 1008

Qy   1412 GATTCAAAAACTGCTTTAAGTCATGCTTATGCTGCTG 1448
          | ||| | ||    |||||  |    || | ||| ||
Db   1009 GCTTCCAGAAGCATTTTAAAGCTCAGTTGTTCTGTTG 1045



RESULT 14
US-09-016-434-1275





; Sequence 1275, Application US/09016434
; Patent No. 6500938
; GENERAL INFORMATION:
; APPLICANT: Janice Au-Young
; APPLICANT: Jeffrey J. Seilhamer
; TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF SIGNALING
; TITLE OF INVENTION: PATHWAY GENE EXPRESSION
; NUMBER OF SEQUENCES: 1490
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: INCYTE PHARMACEUTICALS, INC.
; STREET: 3174 PORTER DRIVE
; CITY: PALO ALTO
; STATE: CALIFORNIA
; COUNTRY: USA
; ZIP: 94304
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/016,434
; FILING DATE: HEREWITH
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; CLASSIFICATION:
; ATTORNEY/AGENT INFORMATION:
; NAME: Zeller, Karen J.
; REGISTRATION NUMBER: 37,071
; REFERENCE/DOCKET NUMBER: PA-0002 US
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (650) 855-0555
; TELEFAX: (650) 845-4166
; INFORMATION FOR SEQ ID NO: 1275:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1413 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; IMMEDIATE SOURCE:
; LIBRARY: GENBANK
; CLONE: g291876
US-09-016-434-1275 

Query Match 2.5%; Score 106.6; DB 4; Length 1413; 

Best Local Similarity 48.5%; Pred. No. 1.4e-16; 

Matches 425; Conservative 0; Mismatches 434; Indels 18; Gaps 4; 

Qy    578 TGGGGATCATCGGGAACTCCACACTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAA 637
          |||| ||| | || ||  | |  ||  | | | |  | | ||||| ||| | ||||| ||
Db    328 TGGGCATCCTTGGAAATGCTATTCTCATCAAAGTCTTTTTCAAGACCAAATCCATGCAAA 387

Qy    638 ACGGTCCCAATATCTTGATCGCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTG 697
            | ||| ||||| || ||| ||||| ||||| | ||||| ||  | |   |  | | |
Db    388 CAGTTCCAAATATTTTCATCACCAGCCTGGCTTTTGGAGATCTTTTACTTCTGCTAACTT 447

Qy    698 ACATCCCTATCAATGTCTACAAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGT 757
             | ||  |  |||      |    || ||||| |  ||||  || |||         |
Db    448 GTGTGCCAGTGGATGCAACTCACTACCTTGCAGAAGGATGGCTGTTCGGAAGAATTGGTT 507

Qy    758 GTAAGCTGGTGCCTTTCATACAGAAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTG 817
          ||||| || |  ||||||| | |    | || || ||  |  | ||| | |   ||
Db    508 GTAAGGTGCTCTCTTTCATCCGGCTCACTTCTGTTGGTGTGTCAGTGTTCACATTAACAA 567

Qy    818 CTCTGAGTATTGACAGATATCGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGG 877
           ||| ||   |||||||||    || ||||          |       |          |
Db    568 TTCTCAGCGCTGACAGATACAAGGCAGTTGTGAAGCCACTTGAGCGACAGCCCTCCAATG 627

Qy    878 TTCCAAAATGGACAGCAGTAGAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTG 937
                    |||    ||| ||  ||  |   | ||| | || ||| || |  | |||
Db    628 CCATCCTGAAGACTTGTGTAAAAGCTGGCTGCGTCTGGATCGTGTCTATGATATTTGCTC 687

Qy    938 TCCCTGAAGCCATAGGTTTTGATATA-----ATTACGATGGACTACAAAGGAAGTTATCT 992
          | ||||| || |||  ||   || ||      || |||    | |  ||  |  | |  |
Db    688 TACCTGAGGCTATATTTTCAAATGTATACACTTTTCGAGATCCCAATAAAAATATGACAT 747

Qy    993 GCGAATCTGCTTGCT-TCATCCCGTTCAGAAGACAGCTTTCATGCAGTTTTACAAGACAG 1051
            |||||   |  || | |||| ||    ||||     ||     |  |  |
Db    748 TTGAATCATGTACCTCTTATCCTGTCTCTAAGAAGCTCTTGCAAGAAATACATTCTCTGC 807

Qy   1052 CAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTTGCCATTGGCCATCACTGCATTTT 1111
                  |    | | | || | |     |||   |    |  | |  |  | | ||
Db    808 TGTGCTTCTTAGTGTTCTACATTATTCCACTCTCTATTATCTCTGTCTACTATTCCTTGA 867

Qy   1112 TTTATACACTAATGACCTGTGAAATGTTGAGAAAGAAAAGTGGCATGCAGATTGCTTTAA 1171
          ||  ||      |   |      |   |||  |     | ||     || |    |
Db    868 TTGCTAGGACCCTTTACAAAAGCACCCTGAACATACCTACTGAGGAACAAAGCCATGCCC 927

Qy   1172 ATGATCACCTAAAGCAGAGACGGGAAGTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCT 1231
           | | ||  |  |     ||  |  | | |||| ||| || ||     ||||   | | |
Db    928 GTAAGCAGATTGAATCCCGAAAGAGAATTGCCAGAACGGTATTGGTGTTGGTGGCTCTGT 987

Qy   1232 TTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGATTCTGAAGCTCACTCTTTATAATC 1291
          |||||||||||||| | ||   |||||||         ||| | |||   | ||
Db    988 TTGCCCTCTGCTGGTTGCCAAATCACCTC---------CTGTACCTCTACCATTCATTCA 1038

Qy   1292 AGAATGATCCCAATAGATGTGAACTTTTGAGCTTTCTGTTGGTATTGGACTATATTGGTA 1351
              | |  || ||  |   || |  |    | | |  ||  | ||   |  | |   |
Db   1039 CTTCTCAAACCTATGTA---GACCCCTCTGCCATGCATTTCATTTTCACCATTTTCTCTC 1095

Qy   1352 TCAACATGGCTTCACTGAATTCCTGCATTAACCCAATTGCTCTGTATTTGGTGAGCAAAA 1411
                ||||||     ||||| ||| | |||||  ||||||| || | | |||||||||
Db   1096 GGGTTTTGGCTTTCAGCAATTCTTGCGTAAACCCCTTTGCTCTCTACTGGCTGAGCAAAA 1155

Qy   1412 GATTCAAAAACTGCTTTAAGTCATGCTTATGCTGCTG 1448
          | ||| | ||    |||||  |    || | ||| ||
Db   1156 GCTTCCAGAAGCATTTTAAAGCTCAGTTGTTCTGTTG 1192



RESULT 15 

US-09-016-434-1215 

; Sequence 1215, Application US/09016434
; Patent No. 6500938
; GENERAL INFORMATION:
; APPLICANT: Janice Au-Young
; APPLICANT: Jeffrey J. Seilhamer
; TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF SIGNALING
; TITLE OF INVENTION: PATHWAY GENE EXPRESSION
; NUMBER OF SEQUENCES: 1490
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: INCYTE PHARMACEUTICALS, INC.
; STREET: 3174 PORTER DRIVE
; CITY: PALO ALTO
; STATE: CALIFORNIA
; COUNTRY: USA
; ZIP: 94304
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/016,434
; FILING DATE: HEREWITH
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; CLASSIFICATION:
; ATTORNEY/AGENT INFORMATION:
; NAME: Zeller, Karen J.
; REGISTRATION NUMBER: 37,071
; REFERENCE/DOCKET NUMBER: PA-0002 US
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (650) 855-0555
; TELEFAX: (650) 845-4166
; INFORMATION FOR SEQ ID NO: 1215:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1726 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; IMMEDIATE SOURCE:
; LIBRARY: GENBANK
; CLONE: g183649
US-09-016-434-1215 

Query Match 2.5%; Score 106; DB 4; Length 1726; 

Best Local Similarity 52.0%; Pred. No. 2.1e-16; 

Matches 238; Conservative 0; Mismatches 220; Indels 0; Gaps 0; 

Qy    536 TCAAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACT 595
          ||   ||  ||| | | |  || |   |  || |  |  || | ||  |||| || |||
Db    514 TCCTCTATGTCATCCCTGCAGTTTATGGGGTTATCATTCTGATAGGCCTCATTGGCAACA 573

Qy    596 CCACACTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGA 655
           |||  |  | |  ||  |||  |    ||||| ||||||||||| ||| ||  | || |
Db    574 TCACTTTGATCAAGATCTTCTGTACAGTCAAGTCCATGCGAAACGTTCCAAACCTGTTCA 633

Qy    656 TCGCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCT 715
          |  ||||  ||||| ||||||||||||| | | |  | |        ||  |  ||| |
Db    634 TTTCCAGTCTGGCTTTGGGAGACCTGCTCCTCCTAATAACGTGTGCTCCAGTGGATGCCA 693

Qy    716 ACAAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCA 775
           || |   ||||| ||    |||| ||||||          || || ||| | || || |
Db    694 GCAGGTACCTGGCTGACAGATGGCTATTTGGCAGGATTGGCTGCAAACTGATCCCCTTTA 753

Qy    776 TACAGAAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGAT 835
          |||||    |||| || ||  |  ||||  | |  ||    || ||       |||||||
Db    754 TACAGCTTACCTCTGTTGGGGTGTCTGTCTTCACACTCACGGCGCTCTCGGCAGACAGAT 813

Qy    836 ATCGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAG 895
          |   |||  |||        |     ||  | |        |  |  |    ||
Db    814 ACAAAGCCATTGTCCGGCCAATGGATATCCAGGCCTCCCATGCCCTGATGAAGATCTGCC 873

Qy    896 TAGAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTT 955
          |  ||   |  || || ||| |  ||||  || | |||||  | || || ||| |   ||
Db    874 TCAAAGCCGCCTTTATCTGGATCATCTCCATGCTGCTGGCCATTCCAGAGGCCGTGTTTT 933

Qy    956 TTGATATAATTACGATGGACTACAAAGGAAGTTATCTG 993
           |||  |   | |  |  |  |  || | |   | | |
Db    934 CTGACCTCCATCCCTTCCATGAGGAAAGCACCAACCAG 971



Search completed: May 14, 2004, 15:54:40 
Job time : 274.574 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: May 14, 2004, 10:14:36 ; Search time 1634.24 Seconds 

(without alignments) 
11943.281 Million cell updates/sec 



Title: US-09-931-157-2
Perfect score: 4301
Sequence: 1 gagacattccggtgggggac ctgggaaaaaaaaaaaaaaa 4301

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0

Searched: 2947324 seqs, 2269024515 residues

Total number of hits satisfying chosen parameters: 5894648

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : Published_Applications_NA:

1: /cgn2_6/ptodata/2/pubpna/US07_PUBCOMB.seq:*
2: /cgn2_6/ptodata/2/pubpna/PCT_NEW_PUB.seq:*
3: /cgn2_6/ptodata/2/pubpna/US06_NEW_PUB.seq:*
4: /cgn2_6/ptodata/2/pubpna/US06_PUBCOMB.seq:*
5: /cgn2_6/ptodata/2/pubpna/US07_NEW_PUB.seq:*
6: /cgn2_6/ptodata/2/pubpna/PCTUS_PUBCOMB.seq:*
7: /cgn2_6/ptodata/2/pubpna/US08_NEW_PUB.seq:*
8: /cgn2_6/ptodata/2/pubpna/US08_PUBCOMB.seq:*
9: /cgn2_6/ptodata/2/pubpna/US09A_PUBCOMB.seq:*
10: /cgn2_6/ptodata/2/pubpna/US09B_PUBCOMB.seq:*
11: /cgn2_6/ptodata/2/pubpna/US09C_PUBCOMB.seq:*
12: /cgn2_6/ptodata/2/pubpna/US09_NEW_PUB.seq:*
13: /cgn2_6/ptodata/2/pubpna/US09_NEW_PUB.seq2:*
14: /cgn2_6/ptodata/2/pubpna/US10A_PUBCOMB.seq:*
15: /cgn2_6/ptodata/2/pubpna/US10B_PUBCOMB.seq:*
16: /cgn2_6/ptodata/2/pubpna/US10C_PUBCOMB.seq:*
17: /cgn2_6/ptodata/2/pubpna/US10_NEW_PUB.seq:*
18: /cgn2_6/ptodata/2/pubpna/US60_NEW_PUB.seq:*
19: /cgn2_6/ptodata/2/pubpna/US60_PUBCOMB.seq:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
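
The per-result statistics in this listing (Score, Query Match, Best Local Similarity) can be cross-checked against the printed Matches/Mismatches/Indels/Gaps counts. The sketch below is only a consistency check, not the search program's own code. It assumes a match scores 1, each gap costs Gapop and each indel position costs Gapext (the header values), and it uses a mismatch penalty of 0.6, which is an inference from the figures printed in this listing rather than a documented setting. The class and method names are hypothetical.

```python
from dataclasses import dataclass

PERFECT_SCORE = 4301.0  # "Perfect score" printed in the search header

# GAP_OPEN and GAP_EXTEND are the header's Gapop/Gapext values.
# MISMATCH_PENALTY = 0.6 is an assumption inferred from the printed
# statistics; it is not stated anywhere in the listing.
MATCH_SCORE = 1.0
MISMATCH_PENALTY = 0.6
GAP_OPEN = 10.0
GAP_EXTEND = 1.0


@dataclass(frozen=True)
class HitStats:
    """Counts from the 'Matches; Conservative; Mismatches; Indels; Gaps' line."""
    matches: int
    mismatches: int
    indels: int
    gaps: int

    def score(self) -> float:
        # Identity scoring: reward matches, penalise mismatches and gaps.
        return (MATCH_SCORE * self.matches
                - MISMATCH_PENALTY * self.mismatches
                - GAP_OPEN * self.gaps
                - GAP_EXTEND * self.indels)

    def query_match_pct(self) -> float:
        # Score as a percentage of the query's perfect (self-match) score.
        return 100.0 * self.score() / PERFECT_SCORE

    def best_local_similarity_pct(self) -> float:
        # Matches as a percentage of all aligned columns.
        columns = self.matches + self.mismatches + self.indels
        return 100.0 * self.matches / columns


if __name__ == "__main__":
    # Counts copied from the US-09-120-772-1 hit (the one with 4 gaps).
    hit = HitStats(matches=425, mismatches=434, indels=18, gaps=4)
    print(round(hit.score(), 1),
          round(hit.query_match_pct(), 1),
          round(hit.best_local_similarity_pct(), 1))
```

Under these assumptions the gapped hit and the ungapped hits in the listing all reproduce their printed Score, Query Match and Best Local Similarity after rounding to one decimal place.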
ALIGNMENTS



RESULT 1 
US-09-931-157-2 

; Sequence 2, Application US/09931157
; Patent No. US20020082414A1
; GENERAL INFORMATION:
; APPLICANT: Imura, Hiroo
; APPLICANT: Nakao, Kazuwa
; APPLICANT: Nakanishi, Shigetada
; TITLE OF INVENTION: Human Endothelin Receptor
; FILE REFERENCE: 299002032411
; CURRENT APPLICATION NUMBER: US/09/931,157
; CURRENT FILING DATE: 2001-10-15
; PRIOR APPLICATION NUMBER: 08/121,446
; PRIOR FILING DATE: 1993-09-14
; PRIOR APPLICATION NUMBER: 07/911,684
; PRIOR FILING DATE: 1992-07-10
; PRIOR APPLICATION NUMBER: JP 3-172828
; PRIOR FILING DATE: 1991-07-12
; NUMBER OF SEQ ID NOS: 2
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 2
; LENGTH: 4301
; TYPE: DNA
; ORGANISM: Homo Sapiens
; FEATURE:
; NAME/KEY: CDS
; LOCATION: (238)...(1566)
US-09-931-157-2 



Query Match 100.0%; Score 4301; DB 9; Length 4301; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 4301; Conservative 0; Mismatches 0; Indels 0; Gaps 



Qy 


l 


GAGACAT T CC G GT GG GG GAC T CT GG C CAGC C C GAG CAAC GT GGAT CC T GAGAGCACT CC C 

I I I I I I M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 II 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 M 1 1 1 1 1 1! 

GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 


60 


Db 


l 


60 


Qy 


61 


AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 

| | M 1 1 1 1 1 1 1 II 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 II 1 1 1 1 M 1 1 
AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 


120 


Db 


61 


120 


Qy 
Db 


121 
121 


AGGAT CAACAC AGT G GCT GAACACT GGGAAG GAACT GGT AC TT GGAGT CT G GAC AT CT GA 

I | | | | | | 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 M 1 

AGGAT CAAC AC AGT GG CT GAACACT G G GAAGGAACT GGT ACT T G GAGT CT GGAC AT C T G A 


180 
180 


Qy 


181 


AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 

| || | | | | | | | | I I I I I II 1 1 1 1 II 1 1 II 1 1 1 1 1 1 1 1 1 II 1 1 1 II II 1 1 1 1 1 1 I 1 M 1 1 1 1 
AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 


240 


Db 


181 


240 


Qy 


241 


CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 

| | | | | | | I I I I I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 M 1 1 1 1 1 1 1 1 1 1 M 1 1 II 1 II 1 1 
CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 


300 


Db 


241 


300 


Qy 


301 


TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 

| | | | | 1 I I I I I 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II II 1 1 1 1 1 1 1 M 1 1 1 1 II 1 M 1 II 
TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 


360 


Db 


301 


360 


Qy 


361 


AC C GCAGAGATAAT GAC G C C AC C C ACT AAG AC CT T AT GGC C CAAG GGT T C CAAC GC C AGT 

| | | || I I 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 II 1 1 1 M 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 


420 


Db 


361 


420 


Qy 421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480

Qy 481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540

Qy 541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600

Qy 601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660

Qy 661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720

Qy 721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780

Qy 781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840

Qy 841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900

Qy 901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960

Qy 961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020

Qy 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080

Qy 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140

Qy 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200

Qy 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260



Qy 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320

Qy 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380

Qy 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440

Qy 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500

Qy 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560

Qy 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620

Qy 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680

Qy 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740

Qy 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800

Qy 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860

Qy 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920

Qy 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980

Qy 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040

Qy 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100

Qy 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160



Qy 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220

Qy 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280

Qy 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340

Qy 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400

Qy 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460

Qy 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520

Qy 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580

Qy 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640

Qy 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700

Qy 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760

Qy 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820

Qy 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880

Qy 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940

Qy 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000



Qy 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060

Qy 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120

Qy 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180

Qy 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240

Qy 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300

Qy 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360

Qy 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420

Qy 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480

Qy 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540

Qy 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600

Qy 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660

Qy 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720

Qy 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780

Qy 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840

Qy 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900

Qy 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960

Qy 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020

Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080

Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140

Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200

Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGGAAAAAAAAAAAAAAA 4301
        |||||||||||||||||||||||||||||||||||||||||
Db 4261 AAAATGCCACATTTCTGGTCTCTGGGAAAAAAAAAAAAAAA 4301





RESULT 2 
Sequence 13, Application US/09921406C
SEQ ID NO 13
ORGANISM: Homo sapiens



US-09-921-406C-13 



Query Match 99.6%; Score 4284.4; DB 10; Length 4286; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy 1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

Qy 61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120

Qy 121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180

Qy 181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
       |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db 181 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240

Qy 241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300

Qy 301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360

Qy 361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420

Qy 421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480

Qy 481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540

Qy 541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600

Qy 601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660

Qy 661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720

Qy 721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780



Qy 781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840

Qy 841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900

Qy 901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960

Qy 961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020

Qy 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080

Qy 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140

Qy 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200

Qy 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260

Qy 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320

Qy 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380

Qy 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440

Qy 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500

Qy 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560

Qy 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620



Qy 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680

Qy 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740

Qy 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800

Qy 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860

Qy 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920

Qy 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980

Qy 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040

Qy 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100

Qy 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160

Qy 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220

Qy 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280

Qy 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340

Qy 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400

Qy 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460

Qy 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520 



Db 


Z 4 bl 


t r a r* t a t r p t a p c tt a a A P T P T GT T T GGT T TT GT C AT C T GT AAAT ACTT AC CT ACAT AC A 


2520 


Qy 


2521 


CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 


2580 




I | M I 1 I I 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 




Db 




PTPPATPTAPATrcATTAAATGAGGGPAGGPPCTGTGCTCATAGCTTTACGATGGAGAGAT 


2580 


Qy 


2581 


GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 


2640 




1 I I I 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 




Db 


ZD 0 1 


p p p a pt r a p pt p a t a a T A A a P, A PT GT C A APTGP PT GGT GP AGT GT C C ACAT GACAAAGGG 

bLLnu 1 bn.L b 1 bn. 1 nn 1 nnnbnb 1 w 1 ^nnL 1 ^ x uui ov^n\j x o x ^ v^n.v_^n x ununnnvuu 


2640 


Qy 


2641 


GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 


2700 




I | I I I 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 




Db 


Z b4 1 


bbnbb 1 Au^nLLL 1 ^1 1 bnb b bn 1 oL 1^1 uu X innnnJ. uu x x x x nu^nim \j x nx ***^x 


2700 


Qy 


2701 


GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 


2760 




1 I I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


z / Ul 


rTT^TafTTafl AATAPTATTTTTPAAA ATP AT AT AP,ATT AP.T AP ATTT A AC AGPT APPTG 
\j b 1 J\ 1 J\\j 1 1 /\r\nn 1 nb Inl 1111 ^.ru-vrvtt. 1 bn 1 nbnbn 1 1 rio x nv^n x x x /vi^/nj^ ± n^ ^ i u 


2760 


Qy 


2761 


TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 


2820 




1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 M 1 M 1 1 




Db 


Z / bl 


m 7\ 7\ t\ /T^qr-p-ATT apt A ATTTTTPT ATT ATTTTTP,T A A AT AfJPP A AT AG A A A AGTTTGPTTG 
1 nnnbb 1 Inl Inb 1AR1 1111 olnl Inl 1111 bl/\rtnlnu^U/v\lno/Winui x x x x \j 


2820 


Qy 


2821 


ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 


2880 




1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


O O O T 

2821 


7\ a Tr , ri i rr r T ir T l T ir T i r f n 1 Tr , B r T i rT APAPPPAAA APTP.PTTTTTP, APAPPGT A AG A APPTPTT 
AbAl CjCjI bL lllltlll bn.1 b 1 Ab/\b b 1 bbl 1111 uAuAU^u l rtrturtn^^ i l i i 


2880 


Qy 


2881 


AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 


2940 




1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 II 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


0001 
Z o o 1 


7\r , r , T""PTr'TrT , PTTPPTPPPTA ATTTTT AT ATPTTPT A A HP A A AfTTf^PPTT AGGAT AGCTT 

AbL 1 1 i bl bbb 1 1 1 bbb Innl 111 lnln.1 LI ihrtuvnrtnUl X invjOrt.lr\VJ\- x x 


2940 


Qy 


2941 


G GGAT GAGAT GT GT GT G AAAGT AT GTACAAGAGAAAAC GGAAGAGAGAGGAAAT GAGGT G 


3000 




1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


2941 


pp/"Arnr , Ar*AT | r , Tr'Tr , Tr , AAAPTATPTAPAAPACA A A APCC A AP,AP,AP,AGGA A ATGAGGTG 
bbbA.1 bnbnl blblbl bnnnb 1 nl b 1 nbnnbnbnnnnb b bnnbnbnbn<J onnn X \js\\dvj x \j 


3000 


Qy 


3001 


GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 


3060 




1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


J U U 1 


r^PTTPP app A A appp at rzrzcrz A P AP,ATTPPP ATTPTT AGPPT A APGTTPGTC ATTGCCT 
bbb 1 1 bbnbbnnnLLL.nl bbbbnbnbnl X Hw-Lni 1^1 1 au^l. x nnv^o x x ± v-«n ± x vjv^v^ x 


3060 


Qy 


3061 


CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 


3120 




1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


oUbl 


rr-Tr* a r* a Tnn at pp a a a apptpptp ATTTTf^TTPP ACC A A A AP AP AGTGP AATGTTCTCA 
L-b 1 LAtnl Lnnl blnnnnbb 1 bb 1 bnl 1 1 lbl 1 LLnbLnnnnLnLno X o^nnx ui i^i ^-n 


3120 


Qy 


3121 


GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 


3180 




1 I 1 I I 1 II M 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 II 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 




Db 


J Iz 1 


r Ar-Trf" AfTTTPP A A AT A A ATTPP P PPP A A P. AP,PTTT A APTPGGTPTT A A A AT AT GCCCAA 

bnb 1 bnb 111b bnnn. 1 nnrt. 1 IbobLL bnnbn\JL XXX nnL X^OOX^X X r\r\r\r~^. x n x u^v- \-^nn 


3180 


Qy 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240

Qy 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300

Qy 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360


Qy 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420

Qy 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480

Qy 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540

Qy 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600

Qy 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660

Qy 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720

Qy 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780

Qy 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840

Qy 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900

Qy 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960

Qy 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020

Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080

Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140


Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200

Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
        ||||||||||||||||||||||||||
Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286



RESULT 3 

US-10-225-567A-113 

; Sequence 113, Application US/10225567A 

; Publication No. US20030113798A1 

; GENERAL INFORMATION: 

; APPLICANT: Lifespan Biosciences 

; APPLICANT: Brown, Joseph P. 

; APPLICANT: Burmer, Glenna C. 

; APPLICANT: Roush, Christine L. 

; TITLE OF INVENTION: ANTIGENIC PEPTIDES AND ANTIBODIES FOR G PROTEIN-COUPLED RECEPTORS (GPCRS)

; FILE REFERENCE: 1920-4-4 

; CURRENT APPLICATION NUMBER: US/10/225,567A

; CURRENT FILING DATE: 2001-12-19 

; PRIOR APPLICATION NUMBER: 60/257,144 

; PRIOR FILING DATE: 2000-12-19 

; NUMBER OF SEQ ID NOS: 2292

; SOFTWARE: PatentIn version 3.1

; SEQ ID NO 113 

; LENGTH: 4286

; TYPE: DNA
; ORGANISM: Homo sapiens 
US-10-225-567A-113 



Query Match 99.6%; Score 4284.4; DB 15; Length 4286; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy 1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

Qy 61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120

Qy 121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180

Qy 181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
       |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db 181 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240

Qy 241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300



Qy 301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360

Qy 361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420

Qy 421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480

Qy 481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540

Qy 541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600

Qy 601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660

Qy 661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720

Qy 721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780

Qy 781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840

Qy 841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900

Qy 901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960

Qy 961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020

Qy 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080

Qy 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140



Qy 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200

Qy 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260

Qy 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320

Qy 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380

Qy 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440

Qy 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500

Qy 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560

Qy 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620

Qy 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680

Qy 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740

Qy 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800

Qy 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860

Qy 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920

Qy 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980

Qy 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040


Qy 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100

Qy 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160

Qy 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220

Qy 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280

Qy 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340

Qy 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400

Qy 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460

Qy 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520

Qy 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580

Qy 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640

Qy 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700

Qy 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760

Qy 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820

Qy 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880



Qy 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940

Qy 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000

Qy 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060

Qy 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120

Qy 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180

Qy 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240

Qy 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300

Qy 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360

Qy 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420

Qy 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480

Qy 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540

Qy 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600

Qy 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660

Qy 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720

Qy 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780

Qy 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840

Qy 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900

Qy 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960

Qy 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020

Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080

Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140

Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200

Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
        ||||||||||||||||||||||||||
Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286



RESULT 4 

US-10-007-926A-177 

; Sequence 177, Application US/10007926A
; Publication No. US20030143539A1
; GENERAL INFORMATION:
; APPLICANT: BERTUCCI, FRANCOIS
; APPLICANT: HOULGATTE, REMI
; APPLICANT: BIRNBAUM, DANIEL
; APPLICANT: NGUYEN, CATHERINE
; APPLICANT: VIENS, PATRICE
; APPLICANT: FERT, VINCENT
; TITLE OF INVENTION: GENE EXPRESSION PROFILING OF PRIMARY BREAST CARCINOMAS
; TITLE OF INVENTION: USING ARRAYS OF CANDIDATE GENES
; FILE REFERENCE: 1546-R-00
; CURRENT APPLICATION NUMBER: US/10/007,926A
; CURRENT FILING DATE: 2001-12-07
; PRIOR APPLICATION NUMBER: 60/254,090
; PRIOR FILING DATE: 2000-12-08
; NUMBER OF SEQ ID NOS: 468
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 177
; LENGTH: 4286
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; OTHER INFORMATION: endothelin receptor type b (EDNRB) gene.
US-10-007-926A-177

Query Match 99.6%; Score 4284.4; DB 15; Length 4286; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy 1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

Qy 61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120

Qy 121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180

Qy 181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
       |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db 181 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240

Qy 241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300

Qy 301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360

Qy 361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420

Qy 421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480

Qy 481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540

Qy 541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600



601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660 

1 1 1 1 1 1 m 1 1 1 1 i ii 1 1 1 1 m 1 1 1 1 1 n 1 1 1 1 1 m i_m i^m m 1 

601 CTT 



CTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660 



661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720 



841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900 



Qy  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960

Qy  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020

Qy 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080

Qy 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 1140

Qy 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200

Qy 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 1260

Qy 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320

Qy 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380

Qy 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440

Qy 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500


Qy 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 1560

Qy 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620

Qy 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT 1680

Qy 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740

Qy 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800

Qy 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860

Qy 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920

Qy 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980

Qy 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040

Qy 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100

Qy 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160

Qy 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220

Qy 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280

Qy 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2281 TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340

Qy 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA 2400

Qy 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460



Qy 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 2520

Qy 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580

Qy 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640

Qy 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 2700

Qy 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760



Qy 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 2820

Qy 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880

Qy 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940

Qy 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2941 GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG 3000

Qy 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060



Qy 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120

Qy 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180

Qy 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240

Qy 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300

Qy 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360

Qy 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420



Qy 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480

Qy 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540

Qy 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600

Qy 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660

Qy 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720



Qy 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780

Qy 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840

Qy 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900

Qy 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960

Qy 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 4020



Qy 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4021 AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 4080

Qy 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 4140

Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA 4200

Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT 4260

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286
        ||||||||||||||||||||||||||
Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286



RESULT 5 
SEQ ID NO 15 
Query Match 99.6%; Score 4284.4; DB 15; Length 4286; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

Qy   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120

Qy  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180

Qy  181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
        |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db  181 AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240

Qy  241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300

Qy  301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360

Qy  361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420

Qy  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480

Qy  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540

Qy  541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600

Qy  601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
Qy  661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720

Qy  721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780

Qy  781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840

Qy  841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900

Qy  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960

Qy  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020

Qy 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080



1081 TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 

I I I I I I I I I | | | I I I I I I I I I I I I I I M I 1 I I I I I I I I I I I M II I I I I I I I I I I 

1081 Itctgcttgccattggccatcactgcatttttttatacactaatgacctgtgaaatgttg 

1141 agaaagaaaagtggcatgcagattgctttaaatgatcacctaaagcagagacgggaagtg 
IMMMIIIMIIIIIMIMMIIIMIIMIMIIIIMMIIIIIIMMIIIMI 
1141 agaaagaaaagtggcatgcagattgctttaaatgatcacctaaagcagagacgggaagtg 

1201 gccaaaaccgtcttttgcctggtccttgtctttgccctctgctggcttccccttcacctc 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I M I I I I I M I II I I I I IN 
1201 gccaaaaccgtcttttgcctggtccttgtctttgccctctgctggcttccccttcacctc 



1140 
1140 
1200 
1200 
1260 
1260 
1320 



1261 agcaggattctgaagctcactctttataatcagaatgatcccaatagatgtgaacttttg 
I I l I I I I I I I I I I II I I I I M I I I I I I I I I M II I I I I I I I I I I I M I I I I I I I I I I II I 

1261 AGCA.GGA.TT CT GAAGCT C ACT CT T T AT AAT CAGAAT GAT C C C AAT AGAT GT GAACTTT T G 1320 
1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380 

I I I I I I 1 I 1 I I t I I I I I M I I I I I I 1 I I 1 I M 1 I I I I I I 1 I I I M I I I I II I I M 

1321 AGCTTTCTGTTGGTA.TTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 1380 

1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 1440 

I I I I I I I II I M I I I I M I I I I I I I I I I I M I I I I I I I I I M I I I M I I I I M I I I I I I I 

1381 AACCCAATTGCTCTGTATTTGGTGAGCA 1440 
1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500 

MIIIIIIIIMIIMIIMIIIIIIIIIIIMIIIIIIIIMIIIIIIIIIIIIIIIII 
1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500 

1560 



AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 

1560 



1501 AAGTTC ||||||MI I M I I I I I I M I I I I M I I I I I I I I M 

1501 AAGTT CAAAGCT AAT GAT CACGGATAT GACAACTT C CGTT CCAGT AATAAATACAGCT CA 



1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 1620 

f , . ,,,,, , , I I 1 | | I | | I I I I I I I > I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I 1 I I I I 
1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATAT^ 1620 

1621 AACAAAATGAA^CATTTGCCAAAACAAAACA^ 1680 

IIIIMMIIIIIMIIIIIMIIMIIMIIIIIIIIIMIIMIIIIIIMMIIMI 

1621 AACAAAATGAAACATTTGCCAAAACA 

1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740 

, .... I I I i I 1 I I I I I I I I I I I I I 1 I I I I I I I M I I I I I I I I I I I 1 1 1 I I I I I I I I 1 I 1 I 

1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATT 1740 
1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800 

, ... | ,., I t I I I I I 1 II I I I I I I I I I I I 1 I I I I 1 I 1 I I I 1 I 1 " I I I I 1 1 1 1 lQOn 

1741 TTACGGCATGGAAAGAAAATC 1800 

1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAA I860 

1 1 < , 1 1 1 1 , 1 1 ii ; 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ' 1 1 1 1 1 1 ii 1 1 1 1 1 1 1 1 1 1 1 1 1 



TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 

TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 

II I I I | | I I I 1 I I I 1 1 I I I 1 I I I I I I I I I I I I I 1 I I I 1 I I I I I I I 1 I 1 I I I I I I I 1 1 I li 
TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 

1921 AAT CAAT GGGACT CT GAT AT AAAGG AAGAAT AAGT C ACT GT AAAACAGAACT T T T AAAT G 

MIMII Mill I I M I I I I I I I I I I I M I M I I I I I I II I I IN 

AAT CAAT GGGAC T CT GAT AT AAAGGAAGAAT AAGT C ACT GT AAAACAGAACT T T T AAAT G 



1801 
1861 
1861 



1921 
1981 



AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 

, I , | I | I I I I I I I I I I 1 I I I 1 1 I I I I I I I I I I I I I I I I I I 1 I I M I I I I I I M 

1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 

2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 

I I I I I I l || | | | I I I I I I I I I I M I I I I I I I I M I I I I I I M I I I I I I M I M I I I I I M 

2041 TAT CAC ACT AT TAT CAGAT T GT AAT T AGAT GCAAAT GAGAGAGCAGT T T AGT T GTT GC AT 

2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 
I I | | | I I I | | | | | | | I | I I II I I I I I M I I I II II I I I I I M I II I I I I I I M I I I I I I I 
2101 TTTTCGGACACT GGAAACATTTAAATGAT CAGGAGGGAGTAACAGAAAGAGCAAGGCT GT 

2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 



1860 

1920 

1920 

1980 

1980 

2040 

2040 

2100 

2100 

2160 

2160 

2220 



I I I I I I I I M II M I M I I I I I M I I I M II I I I M I I I I I I I M I I I I I I I I I I 

2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCA CT GCAATATGTAAC 2220 

2221 CAACAT GTCACAAACAAGCAGCAT GTAACAGACT GGCACAT GTGCCA.GCT GAATTTAAAA 2280 
I , , I | ! | | | 1 | | I 1 I I I I I I I I I I I 1 I I I I I I i I 1 I I I I 1 I I I I ' I " « > 1 1 1 1 1 1 1 1 1 1 1 OOQn 
2221 CAACAT GTCACAAACAAGCAGCAT GTAACAGACT GGCACAT GTGCCAGCT GAATTTAAAA 2280 

2281 TAT AAT ACT T T T AAAAAGAAAAT T ATT AC AT CCT T T ACAT T C AGTT AAGAT C AAACC T C A 2340 
2281 TATAATACTTT ,, | | | | I I I I I I I I I I I I I I I M I I I I I I I I I I I I 

2281 TATAATACTTTTA^ 2340 

2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGA 2400 

I I I , I I I I I I I I I I I I I | I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I M I I II I I I I I 

2341 CAAAGAGAAATAGAATGTTTGA 2400 
2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT 2460 

I I I II | | | 1 | | I I I I I I I I I 1 I I 1 » I » 1 I I I I 1 I I I 1 I I 1 I 1 I I I I 1 I 1 1 I I I t I 1 I I I I 
2401 CATACCCTGTG^ 2460 
2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA. 2520 

,,, . . I , I I ! I I I I I I I I I I I 1 I I I I I I I I I I I 1 I I I I I I I I I I 1 M I I I 1 I 1 I I I I I I I 

2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTA^ 2520 
2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT 2580 

I I i i I I I i | I i | | | | I I I I I I I I II I I I I M I I I M I I M I I I I I I M I I I I M II II I I 

2521 CTGCAT GTA.GAT GATT AAAT GAGGGCAGGCCCT GT GCTCAT AGCTTT ACGAT GGAGAGAT 2580 
2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG 2640 

I I I I I I I I I I M I I I I | I I | | M I I I I I I I M I M I I I II I I I I I M I I I I I M I I I I I I 

2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGAC 2640 



2641 GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 nun n |M i ill 1 



2641 



GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 



2700 



2700 



GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 2760 



llliM iiiillllllllllllllMIIIMIIIIIIIIIIMM HIM 

2701 GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 

2761 T AAAGCT T ATT ACT AAT T T T T GT AT TAT T T T T GT AAAT AGC C AAT AGAAAAGT T T GCT T G 

I I , I I I I I I | | | | | | M I I M I I I II I I I I I M I Ml I I I I I I M I I I 

2761 TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 

? 821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 

tmUMMIIMMIIIIIIIIIIMIIIIMIIIIMIIMIIIIIIMIII 

2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 



2760 
2820 
2820 
2880 
2880 
2940 



7 8 8 1 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 
2881 AGCTTTGT ,,,,,,,,,,, |M | | ,, | || | | | || | | I I M 

2881 AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 2940 

2941 GGGAT GAGAT GT GTGT GAAAGTAT GTACAAGAGAAAACGGAAGAGAGAGGAAAT GAGGT G 3000 

I I | | || | | | | | | | | | | | | | | | I I I I II I II I I I I I M I I I I I I I M I I I I M I I I 

2941 GGGAT GAGAT GT GTGTGAAAGTAT GTACAAGAGAAAACGGAAGAGAGAGGAAAT GAGGT G 3000 

3060 
3060 
3120 



3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 

MIIMMIMIIIIIIMIMIIIIMIIIIIMIIMIIIIIMIIMIIIIIIIIII 

3001 GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 



3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 
I I I I I I M I I I | I | | | | I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I M I I I I I 

3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120 



3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 

M I , M I I I I I I I I I I I M I I I I I I I I I I I I M I I M I I I I M I I I I I I I I I I I I I I I M 

3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 
^1 fti ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 

M I I I I I I I I I I I I I I I I I I I I I I I I 1 I I 1 I I I IIIIIIIIIIIMMMMI 

3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 

3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 
M I | | | M I I I I I I I I I I I I M I I I I I I I I I M I II I I I M I I I I I I I I I I I I I I M I N 
3241 TTGTTTTCTGT C AAT AT T GAAT GT GAT G GT ACAGT AAAC C AAAAC C C AAC AAT GT GGC C A 



3180 
3180 
3240 
3240 
3300 
3300 
3360 



3301 GAAAGAAAGAGC AAT AAT AATT AATT CAC ACAC CAT AT GGAT T CT AT T TAT AAAT CAC C C 

Tumi linn 1 1 1 m 1 1 m 1 1 m m 1 11 m m m 1 1 m 1 1 1 1 1 1 1 m m m 

3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360 

3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420 

u mi UNI MINI I Ml III I I I I I I I I I 1 I 1M 

3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420 

3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480 

7 . I , , , I I I I I I 1 I I I I 1 MIIIIMIMIMIII I M I I II M I M II I 

3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480 
1 4 ft 1 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540 

I | || I I I I I I I II I I I II II I I I M I I I I I M I I I M I I II II M II II I M I M 

3481 TAT ATT T AAT T T CT AT T T AAAT T TT AG AT TAT T T T TAT T AC CAT GT ACT G AAT TTT T ACA 3540 



Ov 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 

Y I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I M I I I I I I I I I 

Db 3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 

Ov 3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 

Y Ml | I I I I I I I I I I | | | I I I I I I I I I I I I M I I I I M I I I M I I I 

Db 3601 T GAAACT AC ACAC AAAAAGCAT ACT T GC AT TAT T T AT AAT AAAAT T GC AT T CAGT G GCTT 

Ov 3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 

Y | | | | Ml Ml Ml III III M IN I I I I Illlllllllllllll 

Db 3661 TT T AAAAAAAAT GT TT GAT T CAAAACTT T AACAT ACT GAT AAGT AAGAAAC AATT AT AAT 

Ov 3721 TT CTTTACATACT CAAAACCAAGATAGAAAAAGGT GCTAT CGTT CAACTT CAAAACATGT 

Y || | M | M | | || | | | | I I I I I I I I II II I I II I I I I I I I HUN I I I I I 

TT CTTTACATACT CAAAAC CAAGATAGAAAAAGGT GCTAT CGTT CAACT T CAAAACATGT 

TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 

M M I II II I I I I I I I I M I I I I I I I I I I I I I I I I I I I I II I II M I I I I I M I I 

TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 

Ov 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 

Y M M I I I I I I I I I I I I I I I I I I I I I M II I I I I I I I I I M M I I I I I I I I I I I I I 

Db 3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 



Db 3721 



Db 3781 



3600 
3600 
3660 
3660 
3720 
3720 
3780 
3780 
3840 
3840 
3900 
3900 
3960 



Ov 3901 GT GGAT GT AT GT T C AAAC AC CTT T TAGT ATT GAT AGCT T ACAT AT GGCC AAAGGAAT AC A 

Y || || I I I I I I I I I II I II II I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I 

GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960 



Db 3901 
Qy 3961 



GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 

M I | | M II I II II I I I M II II I II I II I II II I II I I I I I I M M II I M I II 

Db 3961 GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 



Qy 4021 

Db 4021 



AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 

I I I II I | M II M I I I M I II I I M I I I M I I I I M M M I I II M I I I I I I I I I 

AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 
TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 



4081 TTTATTATGTAAGCAAAACCAAl AAAAAXi iflnoi 1 1 j. ± j. inn^nn^itivv* - 

Uy m | | | | | | | | | M I I I I I I II II I I M I I I I I I i M I I I II 



Db 4081 TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 

Qy 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 4141 ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA

Qy 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 4201 CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT

Qy 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286

        ||||||||||||||||||||||||||

Db 4261 AAAATGCCACATTTCTGGTCTCTGGG 4286



4020 

4020 

4080 

4080 

4140 

4140 

4200 

4200 

4260 

4260 



RESULT 6 

US-10-372-683-48 

; Sequence 48, Application US/10372683 
; Publication No. US20040009171A1



GENERAL INFORMATION: 
APPLICANT: GERRITSEN, MARY E. 
APPLICANT: PEALE JR., FRANKLIN V. 
APPLICANT: WU, THOMAS D. 

TITLE OF INVENTION: METHODS FOR THE TREATMENT OF CARCINOMA 
FILE REFERENCE: P1928R1P1 

CURRENT APPLICATION NUMBER: US/10/372,683 
CURRENT FILING DATE: 2003-02-21 
PRIOR APPLICATION NUMBER: US 10/271,690 
PRIOR FILING DATE: 2002-10-16 
PRIOR APPLICATION NUMBER: US 60/344,534 
PRIOR FILING DATE: 2001-10-18 
NUMBER OF SEQ ID NOS : 49 
SEQ ID NO 48 
LENGTH: 4286 
TYPE: DNA 

ORGANISM: Homo sapien 
US-10-372-683-48 

Query Match 99.6%; Score 4284.4; DB 16; Length 4286; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 4285; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy 1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC




61 


Db 


61 


Qy 


121 


Db 


121 


Qy 


181 


Db 


181 


Qy 


241 


Db 


241 


Qy 


301 


Db 


301 


Qy 


361 


Db 


361 


Qy 


421 


Db 


421 


Qy 


481 



60 
120 



AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 

MINIMUM | | | II I II I II I I I II I I I I II I I I I M M II II II M I II II 

AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120 
AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 

I I 1 1 I I I I I I I | | I I 1 1 I I I I 1 1 I I I I I 1 1 I I I I I 1 1 I I I 1 1 I I M I I I M I M 

AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 
AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 

) 1 | 1 | I I 1 I I I I M I II I II I II I I I I II M M I I M II I I M I II I I II I 

AACTTGGCTCTGAAACTGCGCAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240



180 
180 
240 



CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 

M I I I I I II I I I II II I I M I I M I I I II I I M I I I I I I I I I I MM I I I Ml 

CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 
TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 

I I I I I I M I I I I I I I I I I I I M I I I I II I I I M I I I I I I I I I I I M I II I I M I I I I I I I 

TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 



ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 

| M II I I I I I I I I M I I I I I Ml I II I I M II II I II 

ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 



300 
300 
360 
360 
420 



CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 

UN | U II I I II I I I I I I M II II M I I I II I II I I I I M I I I I M I II M M I 

CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 
CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 

I I I I I | | | | | | M M I I M M I M M II I I I M I II I M I I M MMMIIMM 



480 



540 



481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 m 1 1 1 1 ii 1 1 1 1 1 1 1 1 1L i iiiiiiiiiliiiiiiiiiiiiiiii 

541

601 



TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600 
CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660 



I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 'ILiillliiiiliiiliiiiiiiim^r 

601 



CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660



721 



721 



AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720 

| | | M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 

AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720

CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780

I 1 | | | | | | | 1 | i I I I 1 I I I 1 1 I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I 

CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 



780 



781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840

I | | | | | M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I 
781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840

841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900 
MM M I I M ' I I I I I ' I ! I M I ! I I M I I I M I I I I I I I I I I I M I I M M M 



841 

901 



GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900 
ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960 



I M I I I I I I II I I I I I I M M I I I I I II I I 1 [[ [liiiii! [ iiiiii iiiiiiii] 

901 



ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 960 



961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020 

| | || || | I I I I I I I M I I I I I I M I I I I I I I I I I I I I I I I I I M I I I I I I M I I I I I I I I 
961 ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020 

AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080 

| || M I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I II I I I M I I I I I I I I I I I I I I I I 
AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT 1080 

TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 

MM Mill II I MINI Mill II INN II I " I I I I I I I I I I I ' ' ' 

TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 



1021 
1021 
1081 
1081 
1141 
1141 



AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG

I I M II I I I II I I I I I I I I I I M I I I I I I I I I I I I I M I I I I I II I I I II M I I I M I I I 

AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 



1140 
1140 
1200 
1200 
1260 



1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 

I I M M I I I I I I I M I I I I M II I I I M I I I I I I I M II M II I I I II I II I I I M M 



1201 



GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 



1320 



1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 
|| I II II I I II I II I I I I II I I I II I I I I I I I I I M I I I I I M I I I I I M I I I II I II I I 
1261 AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG 1320 



1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 

MMMM | M II I II I M I I I I M M II I I I II I I I I M I I I I I M I I I I I I M 

1321 AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 



1380 
1380 



1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 

| | | M I I M I I M I II I I I I I I I I I I I M I I I II I I I I I I I I I I I I I I I 

1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 



1440 
1440 
1500 



1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 

| || I II I I I I II I II II II M I I I II II I I I I I I I I II I I I I I I I I I I I I 

TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 1500 



1441 



1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 

| | | | | | | | | | | | I I I II II I I I I I I I I I I I I I M I I II I I I I I I M I II I I I I I I 

1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA 

1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 

| | | | | | | | | | | | | | | | | I I I I I I I I I I I II M I I I I I M I M I M I I I I I I I I I I 

1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA 

1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT

| | | M | | | | I I I II II I I I I I I I I I I M I I I I I I M I I I I M I I I I I 

1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT

1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT

| | | | | | II I I II I I I I I M I I I I I I I I I I M I I I I I I I I I I M II I M 

1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 

1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 
M | I I I I I I I I I I I I I I I I I M I I I I I I M I I I M I I I I I I I I I M I I I I 

1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 
1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 

I | | | | | || | | | | || | II I I I I I I I I I I I I I I I M I I I II I I I I I I I I I I M I I N I I I I I 

1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 

1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 

I | ! M I ! I I I I I I i I I I I I M I l MINIUM I I I I I I M I I I 1 I I i 

1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 

1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG

M M I I M II I M M II I M M I II II II II I M M I I I I I M I I I II I I M II I 

1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 

1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 
I | || II M II I M I I I M I I I II M I II M II I II II M M II II M I II II M 

1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 

2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT
| | || | || | | M I I II II I II I I I M I M I II II II I II II M I I I II M II I I I M II I I 
2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 

2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 

M M I II II II M I II M I I I M I I I I I I M II II M I I M I II II II I I I M I I II II I 

2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT



1560 
1560 
1620 
1620 
1680 
1680 
1740 
1740 
1800 
1800 
1860 
1860 
1920 
1920 
1980 
1980 
2040 
2040 
2100 
2100 
2160 
2160 
2220 



2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 
I | || M I I II I I I I I I M I I I I M I I M I I I M II M II M I M I II I I I II I I M M I I 
2161 TTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTGCAATATGTAAC 2220 



2221 CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA 2280 



2281 



| | | | | | | | M I I I I I M I I I I I I I I I I I II I I I I I I I IN I I I I I 

CAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGCTGAATTTAAAA

TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340

I IMIIIIIIIM I I I I I I I I I I ■ 1 I 1 MIIIMMIIIIIII 

TATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCAAACCTCA 2340



2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA

Mill | | | | t I 1 I I I I 1 I I 1 I I I I I I t I i ! I I I I I > 1 I I I 1 I > ■ 1 > ■ 1 1 1 1 1 1 1 1 

2341 CAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTGTCATTCA

2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT

I I M i I I I I I M I I I 11 I M I I I M I M II I I I I II II I I I I M I I Ml 

2401 CATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCTTCTTTTT
2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA 
1 | | | I t I I I I 1 I I I I 1 I I I I I 1 I I I 1 I 1 I ■ ■ ■ I 1 I ■ 1 1 1 1 1 1 1 1 1 1 1 1 ' 1 1 1 1 < 

2461 TCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCTACATACA

2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT

minium; minium mm nmm mmm 

2521 CTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATGGAGAGAT

2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG
iimmmimmmmimimimimimmimmmmim MINIM 

2581 GCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGACAAAGGG



GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 

I I I M | N II I I I II II N II I II I I N II II I I II I II N I I M II II N 

GCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATATGTATAAT 



2641 
2641 
2701 
2701 
2761 
2761 
2821 

M I I M M I II M I I I I I I II 1 I I I M 1 1 I I I I I t I I 11 M I I I i I i I 1 I I i 1 

2821 ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT 2880

2881 



GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG

| | | | | | II I I I I I I I I M II I I I N I I I I I I I I I N II II I I I I N I I I I I I I I I I I I I I 
GCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAGCTACCTG 

TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG 

Ml || || N N I I II N I I I I N II II I N I II N I 11 I II I I N I I I I I I I I I I 

TAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGTTTGCTTG
ACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGAACCTCTT

N II II II I I N M I II I II I II I II I I 1 N I liilLliliiliilililliiii 



2400 
2400 
2460 
2460 
2520 
2520 
2580 
2580 
2640 
2640 
2700 
2700 
2760 
2760 
2820 
2820 
2880 



2881 
2941 
2941 
3001 
3001 
3061 



AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 

|| || | M II II II II I N I I M I M II II II N N II II I N I N N II I II N II I II I 
AGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGGATAGCTT 

GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG

I M II II II I II M II I I N M I I I M II II M I I II I I N II II N I I I I I M M I II I 

GGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAGGAAATGAGGTG



2940 
2940 
3000 
3000 
3060 



GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 

M || I II I II II II I I II N I N N II II N II II II II I N NM MM 

GGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTCATTGCCT 3060 
CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120



3061 CGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTGCAATGTTCTCA 3120

3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180 

| | | | | | | | | | I I I I I I I I I I I I I I I I I I I I I M I I M I I I I I I I I I I I I 

3121 GAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAAAATATGCCCAA 3180 

3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240 

M I I I I I I I I I II I I II I I I I I I I I I I M M I I M I I I I I I I I I I I I M I M I I 

3181 ATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATAAGCTAGTAATG 3240 

3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300

| | | | I | | I M I I I I I I I M I I I I I I I I I I I I M II I I I I I I I I I I I I I I I 

3241 TTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAACAATGTGGCCA 3300 

3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360

| | | | | | | | | | | || I I I I I I II M I I I I I I I I I I I I I I M I II I I II I I M I I M I I I I I I 
3301 GAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTTATAAATCACCC 3360

3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420

| | | | | | | | | | M I I M I I I I II I I I I I M II I I I I I I I I I I I I I I I I I I I I I I I M M I I 

3361 ACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTTATCATAGAAGT 3420

3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480

| | | | || | | | | I II I I I I I I II II I I I I I I I I mum 

3421 CATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGTTTATTAA 3480

3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540

| | | | | | | | | | || I I I I I I I I I I I I I I I I I I I I I I I I M II I I I II I M M M I M I II II 
3481 TATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAATTTTTACA 3540

3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600 

| M I I I I I I I M M M I I I I I I I I I I I I M I M II II II I I I I I I M I I I II I I 

3541 TCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGCCAAATTT 3600 

3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660 

Mill I I I I M M M I I M I II I I I I I II I I M I II I I M I I I I I II I I I II I 

3601 TGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCAGTGGCTT 3660 

3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720

M | | || | | | | | M II I II II I II I M M I I I I M II II I I I M II I I I I I I I I I I 

3661 TTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAATTATAAT 3720

3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780

| | | | | M I I M I I I II I I I I I II I I I II I I I I I M I I M II II I I I M I I I I I M 

3721 TTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAAAACATGT 3780

3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840

| | || | || || || I I I II I I I I I I I I I I I I I I I I I M M M M I I II I II M 

3781 TTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATGGATGTTA 3840 

3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900

| | || | | | || | | || | I I I I I II II I I I I M I I M M M II I I I I M I I II II M I II M M 
3841 CAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCACTGCTAAT 3900 

3901 GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA 3960 
|| | | | | | I I I I I I I I I I I II I I M I II I II I I II I II I I I II I I I M II II 



Db 


3901 


GTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAGGAATACA




Qy 


3961 


GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 

| | | M 1 1 1 1 1 1 1 1 M 1 1 M 1 1 1 1 1 1 M 1 1 II 1 1 1 1 1 1 1 M 1 II 1 1 1 1 1 1 1 1 

GTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATAACAATGT 


4020


Db 


3961 


4020


Qy 


4021 


AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT 

1 I ; | | | I | I 1 I I 1 t 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i ■ 1 1 1 I 1 1 1 MINI 

AAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTACTGATTT


4080


Db 


4021 


4080


Qy 


4081 


TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 

| | | | I 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

TTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTTATTTTTC 


4140


Db 


4081 


4140


Qy 


4141 


ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA

1 1 1 1 1 1 1 M 1 1 1 M 1 1 1 1 1 M 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 II 1 1 1 

ACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATAAATGTGA


4200


Db 


4141 


4200 


Qy 


4201 


CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT

1 1 I I | M 1 II 1 I 1 II 1 1 1 1 M 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 

CAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTATTCAATT


4260 


Db 


4201 


4260 


Qy 


4261 


AAAATGCCACATTTCTGGTCTCTGGG 4286

1 I 1 II 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 M 
AAAATGCCACATTTCTGGTCTCTGGG 4286 




Db 


4261 





RESULT 7 

US-10-116-802-116 

Sequence 116, Application US/10116802 
Publication No. US20030065157A1 
GENERAL INFORMATION: 
APPLICANT: Amy Lasek 

TITLE OF INVENTION: GENES EXPRESSED IN LUNG CANCER 
FILE REFERENCE: PA-0045 US 

CURRENT APPLICATION NUMBER: US/10/116,802
CURRENT FILING DATE: 2002-04-04 
PRIOR APPLICATION NUMBER: 60/281,593 
PRIOR FILING DATE: 2001-04-04 
NUMBER OF SEQ ID NOS : 519 
SOFTWARE: PERL Program 
SEQ ID NO 116 
LENGTH: 4305
TYPE: DNA 

ORGANISM: Homo sapiens 
FEATURE: 

NAME/KEY: misc_feature

OTHER INFORMATION: Incyte ID No: 1094000*4 
NAME/KEY: unsure
LOCATION: 4301-4302 

OTHER INFORMATION: a, t, c, g, or other 
US-10-116-802-116 

Query Match 97.7%; Score 4202.4; DB 13; Length 4305; 

Best Local Similarity 99.4%; Pred. No. 0; 

Matches 4280; Conservative 0; Mismatches 18; Indels 8; Gaps 



60 



1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 

I I | | | | | | | | | | | | I I I I I I I I I I I II I I I I I I I I I M M I I I I I I I I I I I I I M 

1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60 



61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 

i in I II I I I I I I I I I I I I I I II II M I I I M I II I I II I I II I 

61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 



120 
120 
180 



121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA
| | | | | | | | | I I I I I I I I M I I I I I I M I I I I I I I I I M I I I I I I I I I M M I M I I I I I I 
121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180

181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240 

| | | | | | M I I I I I I II I M I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I II 
181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240 

241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300 



1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 ; I I I M I , 

CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 

TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA
| | | || I I I M I M I I I I I I I I I I I I I I I I I I I I I I 



301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 m ii 1 1 1 1 1 1 1 1 [Hi!!!!!!!!!!] 

361 



ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420



421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480



601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 

661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 
| | || | | | | | | | | | | I I I I I I I II I I I I I I I I M I I I I M I I I I I I I M I I I I I I 



661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720 
721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780

i ii i m 1 1 1 1 1 ii 1 1 1 1 1 1 1 1 1 ii 1 1 1 1 1 1 1 1 MUU!!lllIIIIIIlllIiiiI!' 

721

781 ^-----------^ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 

781 AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840

841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA 900 



CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
AAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGA 840 



I i i i i i i i I I I I I I I I I 1 I I I I I M I I I I I M I I I I I I I M I I I I I I I I I I I M I 

841 GCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAA

901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 

| | | | | | | | | | | | | | I M I I I I M I I I I I I I I I I I I I I I I I M I I I I II I II I I I 

901 ATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGAT 



900 
960 
960 



961 
961 
1021 
1021 
1081 
1081 



ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG 1020

| | | | | I I I I I I I I I I I I I I I I M I I I I I I I I M I I I I I I I I I I I I 

ATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAG



AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT

I I I I II I I II I I I I I M I I M I II I I I I I I M I I I I I I I I I M I I I M I 

AAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTAT

TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG 

| | M I II I I I I II I I II I I I I I I I I I I I I I M I I I I I I Ml 

TTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTG



1020 
1080 
1080 
1140 
1140 



1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG 1200

I | | | | | | | | I I I I I I I I I I M II II I II I I I I I I I I I I I I I M I I I M I M I I I I 

1141 AGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTG
1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 

I I M M I I I I I I I I I I I I I I I I I I I I M I M I I I M I I I I I M I M I I I I I I I I I M II I 

1201 GCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTC 



1261 
1261 
1321 
1321 
1381 



AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG

I | | | | | || M I I I I I I I I I I I I I I I M I I I I I I I II I M I I I II I I I I I I M I I 

AGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTG

AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 

I | | I II I I I I I I I I I I I I M M I II I I I I I I I I M II I I I I I I I I I I M I I I I I I I I I I I 
AGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATT 



AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 

| | | | | | | | | M | | I I I I I I II I I I I I I I I I I I I I I M I I II I I I M I I I I I I I I I N I I I 
1381 AACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTA 



1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 

| i | | | | M | | | | | M II I I M I I I I II I I I I I I I I I I I I I I I I 

1441 TGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTA 

1501 AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA

I I I I | | | | | I I I I I I I I I I I I I I I M I I I I I I I I I I M I I I I I M M I I I I I I M I I I I I 

AAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCA



1501 
1561 



TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA

I I I M I I I I I I M I I I I I I I I I I I I I I I I I I I I I M I I I I I I I M I I I I I I I I I I I I M I 

1561 TCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAA



1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT

| | | | | | | | | II I I I I I I I I M I I I M II I I I I I I I I I 

1621 AACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACACTAT



1200 
1260 
1260 
1320 
1320 
1380 
1380 
1440 
1440 
1500 
1500 
1560 
1560 
1620 
1620 
1680 
1680 



1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740
Illllll 1 I I I 1 I I I I I M I I 1 I I I I I I I I I I M 1 I 11 I I I I 11 I I 1 I I I 



1681 TAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGCTGT 1740

1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800

I I I I I II I I I I I I I I I I 1 I I M M M I M I I I I I I I I I II I I I I I I 

1741 TTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACTTAAT 1800

1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860

M I I I I I I I I I M I I I M I I I I I I I I MINIM 

1801 TTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAACACT 1860

1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920 

| | | | | | | M I I I I I II I I I I I I I I I M I I I I I I I I I I I I M I I I M I I I I I I I I I I M I I 

1861 TAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAGATTTATTTTTA 1920 

1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980

| | | | I II I I II I I I I I I I I M I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II 
1921 AATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGAACTTTTAAATG 1980

1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040

|| M I I I I I I M I II I I I I I I I I I I I I II M I I I II I I I I I 

1981 AAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTTTCAATTAATAT 2040

2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100

| | | | | | I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I 

2041 TATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTTAGTTGTTGCAT 2100

2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160

| | | | | I I I I I I I I I II I I I II I I I I I M I I I I I I I I I I I M I I I I I I I I I M I I I 

2101 TTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAGAGCAAGGCTGT 2160

2161 TTTT GAAAAT CAT T ACACT TT C AC - - T AGAAGC C CAAAC CT CAG CAT T - CT GCAAT AT GT 2217 

I | | I I I M II I I I II MINIM I N I I I I M 

2161 TTTT GAAAAT CAT T ACACT TT C AC CT AGAAG C C C CAAAC CT CAG CAT T C CT GCAAT AT GT 2220 

2218 AA- C CAAC AT GT CACAAACAAGC AG — CAT GT AAC AGACT GGCACAT GT G - C C AGCT GAA 2273 

M II I I I I I I I N M M II M I II I I I N I N I II I II I I I N I I I II 

2221 AACCCAACAT GT CACAAACAAGCCAGCCATGTAACAGACT GGCACAT GT GCCCAGCT GAA 2280 

2274 TTTAAAATATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCA 2333

|| | || N I I I I I I I I I N I N I I I I I I I N N I I I I N II I M I I I N I I I I I 

2281 TTTAAAATATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAGATCA 2340

2334 AACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTG 2393

| | N I I I II I I N I I I I I II M I N I I N I I I N I N I I I N I I M I M I I I I I 

2341 AACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAATCTG 2400

2394 TCATTCACATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCT 2453

|| I I I I I I I I I I N I I I N I I I II II I I II I I I I I II I I I I N I I I I I 

2401 TCATTCACATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAAATCT 2460

2454 TCTTTTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCT 2513 

| | I I || N II N N II I I I I N II II I N II I I I I I I I N I I N N 

2461 TCTTCTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTTACCT 2520 

2514 ACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATG 2573

|| I I I II II II I I I I N I I II II I I II II I II N II II I II II II 

2521 ACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTACGATG 2580



Qy 


2574 


GAGAGATGCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGA

1 1 1 1 I I | | I I I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 11 M 1 1 1 1 M 1 1 1 1 1 I 1 1 

GAGAGATGCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCACATGA 


2633 


Db 


2581 


2640 


Qy 


2634 


CAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATAT 

| | | | | | | | | I | | 1 1 1 1 1 1 1 1 1 1 M II 1 1 1 1 1 II II II 1 1 1 1 1 

CAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGCATAT 


2693 


Db 


2641 


2700 


Qy 


2694 


GTATAATGCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAG

I | | | | | || IN 1 1 1 1 1 1 1 1 1 II M 1 1 1 1 1 1 1 1 1 1 1 M 1 

GTATAATGCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTAACAG


2753 


Db 


2701 


2760 


Qy 


2754 


CTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGT

I | M 1 1 II II | | 1 1 II 1 1 1 1 II II 1 1 M 1 1 1 1 1 1 1 1 1 M 1 M 1 1 1 1 

CTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAAAAGT


2813 


Db 


2761 


2820 


Qy 


2814 


TTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGA 

| | | | I 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 M 1 M 1 M 

TTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGTAAGA 


2873 


Db 


2821 


2880 


Qy 


2874 


ACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGG 

| | | | | | | | I I 1 1 1 II 1 M II 1 1 1 1 1 1 1 1 1 1 1 1 

ACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCTTAGG 


2933 


Db 


2881 


2940 


Qy 


2934 


AT AG CT T GG GAT GAGAT GT GT GT GAAAGT AT GT ACAAGAGAAAAC GGAAGAGAGAGGAAA 

| M || | | | M 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 N 1 11 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

AT AGCTT GGGAT GAGAT GT GT GT GAAAGT AT GT ACAAGAGAAAACGGAAGAGAGAGGAAA 


2993 


Db 


2941 


3000 


Qy 


2994 


TGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTTCGTC 

| | | | | || | || | 1 1 M I 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

T GAGGT G G GGT T GGAGGAAAC C CAT G GG GAC AGAT T C CC AT T C T T AGC CT AAC GTT C GT C 


3053 


Db 


3001 


3060 


Qy 


3054 


ATT GCCT C GT C AC AT CAAT G CAAAAG GT C CT GAT T T T GTT C CAG CAAAAC ACAGT GCAAT 

| | | | | | | | I I I I II 1 1 1 1 M 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 

AT T GCC T C GT C AC AT CAAT GCAAAAG GT C CT GAT T TT GTT C CAG CAAAAC AC AGT GCAAT 


3113 


Db 


3061 


3120 


Qy 


3114 


GT T CT CAGAGT GACT T T C GAAAT AAAT T GGGC C CAAGAGCT T T AACT C G GT CT T AAAAT A 

| | I I | I I II 1 1 1 1 1 II 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 II 

GTT CT CAGAGT GACTTTCGAAATAAATTGGGCCCAAGAGCTTT AACT CGGTCTT AAAAT A 


3173 


Db 


3121 


3180 


Qy 


3174 


TGCCCAAATTTTTACTTTGTTTTTCTTTT AAT AGGCT GGGC CACAT GTT GGAAATAAGCT 

1 | | | || | | I I I I 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 M 1 II 1 1 1 1 1 1 1 1 M II 

TGCCCAAATTTTTACTTTGTTTTTCTTTTAAT AGGCT GGGCCACAT GTT GGAAATAAGCT 


3233 


Db 


3181 


3240 


Qy 


3234 


AGT AAT GTT GT T T T CT GT CAAT AT T GAAT GT GAT GGT ACAGT AAAC CAAAAC C CAACAAT 

I | M | | | | | I I I M 1 1 1 1 II 1 1 II II 1 1 1 1 1 1 1 1 1 M 1 1 M 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 

AGT AAT GTT GT T T T CT GT CAAT AT T GAAT GT GAT GGT AC AGT AAAC CAAAAC C CAACAAT 


3293 


Db 


3241 


3300 


uy 




GT G GC C AGAAAGAAAGAGCAAT AAT AAT T AAT T C AC ACAC CAT AT G GAT T CT AT T T AT AA 
| | | 1 1 1 1 1 1 M 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 
GT G GC C AGAAAGAAAGAGCAAT AAT AAT T AAT T C AC AC AC CAT AT G GAT T CT AT T T AT AA 


3353 


Db 


3301 


3360 


Qy 


3354 


AT C AC C CACAAACT T GT T CT T T AAT T T CAT C C CAAT C AC T T T T T C AGAGG C CT GT TAT C A 

1 I 1 I I I I 1 1 1 M M 1 1 II 1 1 1 II II 1 1 II 1 1 1 II M 1 1 1 1 M 1 1 1 1 1 

AT C AC C CACAAACT T GT T CT T T AAT T T CAT C C CAAT C ACT T T T T C AGAG G C CT GT TAT C A 


3413 


Db 


3361 


3420 



Qy   3414 TAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGT 3473
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3421 TAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCACAGT 3480

Qy   3474 TTATTAATATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAAT 3533
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3481 TTATTAATATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACTGAAT 3540

Qy   3534 TTTTACATCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGC 3593
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3541 TTTTACATCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATCTTGC 3600

Qy   3594 CAAATTTTGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCA 3653
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3601 CAAATTTTGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCATTCA 3660

Qy   3654 GTGGCTTTTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAA 3713
          |||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||
Db   3661 GTGGCTTTTT-AAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAAACAA 3719

Qy   3714 TTATAATTTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAA 3773
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3720 TTATAATTTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACTTCAA 3779

Qy   3774 AACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATG 3833
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3780 AACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAACATG 3839

Qy   3834 GATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCAC 3893
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3840 GATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTATCCAC 3899

Qy   3894 TGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAG 3953
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3900 TGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCCAAAG 3959

Qy   3954 GAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATA 4013
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3960 GAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAATATA 4019

Qy   4014 ACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTA 4073
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4020 ACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATAGTTA 4079

Qy   4074 CTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTT 4133
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4080 CTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTACCTT 4139

Qy   4134 ATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATA 4193
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4140 ATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAATATA 4199

Qy   4194 AATGTGACAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTA 4253
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4200 AATGTGACAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAAGTTA 4259

4254 TT CAAT T AAAAT GCCACATTTCTGGTCTCTGGGAAAAAAAAAAAAA 42 99 



1 I I I I I I I I I I I I I I I I I I I I I I M M I M I I I I I I 

Db 4260 T T C AAT T AAAAT G C C AC AT T T C T GGT C AAAAAAAAAAAAAGNN AGA 4305 

RESULT 8 

US-10-020-141-5/C 

Sequence 5, Application US/10020141
Publication No. US20030092013A1
GENERAL INFORMATION:
APPLICANT: McCarthy, Jeanette
APPLICANT: Ableson, Allen
TITLE OF INVENTION: DIAGNOSIS AND TREATMENT OF VASCULAR DISEASE
FILE REFERENCE: MMI-002
CURRENT APPLICATION NUMBER: US/10/020,141
CURRENT FILING DATE: 2001-12-14
PRIOR APPLICATION NUMBER: US 60/313,097
PRIOR FILING DATE: 2001-08-16
PRIOR APPLICATION NUMBER: US 60/327,485
PRIOR FILING DATE: 2001-10-05
NUMBER OF SEQ ID NOS: 21
SOFTWARE: FastSEQ for Windows Version 4.0
SEQ ID NO 5
LENGTH: 183337
TYPE: DNA
ORGANISM: Homo sapiens
US-10-020-141-5 

Query Match 66.1%; Score 2841.8; DB 15; Length 183337; 

Best Local Similarity 99.9%; Pred. No. 0; 

Matches 2854; Conservative 0; Mismatches 2; Indels 1; Gaps 1; 
Qy   1430 AGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGC 1489
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  72830 AGTCATGCTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGC 72771

Qy   1490 AGTCGTGCTTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATA 1549
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  72770 AGTCGTGCTTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATA 72711

Qy   1550 AATACAGCTCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACC 1609
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  72710 AATACAGCTCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACC 72651

Qy   1610 GAAGTCATTAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCA 1669
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  72650 GAAGTCATTAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCA 72591

Qy   1670 CAGCACACTATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATT 1729
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  72590 CAGCACACTATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATT 72531

Qy   1730 TTATGAGCTGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGA 1789
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  72530 TTATGAGCTGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGA 72471

Qy   1790 AAGCACTTAATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATT 1849
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  72470 AAGCACTTAATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATT 72411

Qy   1850 CACACAACACTTAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAG 1909
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  72410 CACACAACACTTAGGCTTAAAAATGAGCTCACTCAGAATTTCTATTCTTTCTAAAAAGAG 72351

Qy   1910 ATTTATTTTTAAATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGA 1969
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  72350 ATTTATTTTTAAATCAATGGGACTCTGATATAAAGGAAGAATAAGTCACTGTAAAACAGA 72291

Qy   1970 ACTTTTAAATGAAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTT 2029
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  72290 ACTTTTAAATGAAGCTTAAATTACTCAATTTAAAATTTTAAAATCCTTTAAAACAACTTT 72231

Qy   2030 TCAATTAATATTATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTT 2089
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  72230 TCAATTAATATTATCACACTATTATCAGATTGTAATTAGATGCAAATGAGAGAGCAGTTT 72171

Qy   2090 AGTTGTTGCATTTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAG 2149
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  72170 AGTTGTTGCATTTTTCGGACACTGGAAACATTTAAATGATCAGGAGGGAGTAACAGAAAG 72111

Qy   2150 AGCAAGGCTGTTTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTG 2209
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  72110 AGCAAGGCTGTTTTTGAAAATCATTACACTTTCACTAGAAGCCCAAACCTCAGCATTCTG 72051

Qy   2210 CAATATGTAACCAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGC 2269
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  72050 CAATATGTAACCAACATGTCACAAACAAGCAGCATGTAACAGACTGGCACATGTGCCAGC 71991

Qy   2270 TGAATTTAAAATATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAG 2329
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71990 TGAATTTAAAATATAATACTTTTAAAAAGAAAATTATTACATCCTTTACATTCAGTTAAG 71931

Qy   2330 ATCAAACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAA 2389
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71930 ATCAAACCTCACAAAGAGAAATAGAATGTTTGAAAGGCTATCCCAAAAGACTTTTTTGAA 71871

Qy   2390 TCTGTCATTCACATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAA 2449
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71870 TCTGTCATTCACATACCCTGTGAAGACAATACTATCTACAATTTTTTCAGGATTATTAAA 71811

Qy   2450 ATCTTCTTTTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTT 2509
          |||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71810 ATCTTCTTCTTTCACTATCGTAGCTTAAACTCTGTTTGGTTTTGTCATCTGTAAATACTT 71751

Qy   2510 ACCTACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTAC 2569
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71750 ACCTACATACACTGCATGTAGATGATTAAATGAGGGCAGGCCCTGTGCTCATAGCTTTAC 71691

Qy   2570 GATGGAGAGATGCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCAC 2629
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71690 GATGGAGAGATGCCAGTGACCTCATAATAAAGACTGTGAACTGCCTGGTGCAGTGTCCAC 71631

Qy   2630 ATGACAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGC 2689
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71630 ATGACAAAGGGGCAGGTAGCACCCTCTCTCACCCATGCTGTGGTTAAAATGGTTTCTAGC 71571



Qy   2690 ATATGTATAATGCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTA 2749
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71570 ATATGTATAATGCTATAGTTAAAATACTATTTTTCAAAATCATACAGATTAGTACATTTA 71511

Qy   2750 ACAGCTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAA 2809
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71510 ACAGCTACCTGTAAAGCTTATTACTAATTTTTGTATTATTTTTGTAAATAGCCAATAGAA 71451

Qy   2810 AAGTTTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGT 2869
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71450 AAGTTTGCTTGACATGGTGCTTTTCTTTCATCTAGAGGCAAAACTGCTTTTTGAGACCGT 71391

Qy   2870 AAGAACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCT 2929
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71390 AAGAACCTCTTAGCTTTGTGCGTTCCTGCCTAATTTTTATATCTTCTAAGCAAAGTGCCT 71331

Qy   2930 TAGGATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAG 2989
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71330 TAGGATAGCTTGGGATGAGATGTGTGTGAAAGTATGTACAAGAGAAAACGGAAGAGAGAG 71271

Qy   2990 GAAATGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTT 3049
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71270 GAAATGAGGTGGGGTTGGAGGAAACCCATGGGGACAGATTCCCATTCTTAGCCTAACGTT 71211

Qy   3050 CGTCATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTG 3109
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71210 CGTCATTGCCTCGTCACATCAATGCAAAAGGTCCTGATTTTGTTCCAGCAAAACACAGTG 71151

Qy   3110 CAATGTTCTCAGAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAA 3169
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71150 CAATGTTCTCAGAGTGACTTTCGAAATAAATTGGGCCCAAGAGCTTTAACTCGGTCTTAA 71091

Qy   3170 AATATGCCCAAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATA 3229
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71090 AATATGCCCAAATTTTTACTTTGTTTTTCTTTTAATAGGCTGGGCCACATGTTGGAAATA 71031

Qy   3230 AGCTAGTAATGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAA 3289
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  71030 AGCTAGTAATGTTGTTTTCTGTCAATATTGAATGTGATGGTACAGTAAACCAAAACCCAA 70971

Qy   3290 CAATGTGGCCAGAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTT 3349
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70970 CAATGTGGCCAGAAAGAAAGAGCAATAATAATTAATTCACACACCATATGGATTCTATTT 70911

Qy   3350 ATAAATCACCCACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTT 3409
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70910 ATAAATCACCCACAAACTTGTTCTTTAATTTCATCCCAATCACTTTTTCAGAGGCCTGTT 70851

Qy   3410 ATCATAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCA 3469
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70850 ATCATAGAAGTCATTTTAGACTCTCAATTTTAAATTAATTTTGAATCACTAATATTTTCA 70791

Qy   3470 CAGTTTATTAATATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACT 3529
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70790 CAGTTTATTAATATATTTAATTTCTATTTAAATTTTAGATTATTTTTATTACCATGTACT 70731



Qy   3530 GAATTTTTACATCCTGATACCCTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATC 3589
          ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||
Db  70730 GAATTTTTACATCCTGATACCTTTTCCTTCTCCATGTCAGTATCATGTTCTCTAATTATC 70671

Qy   3590 TTGCCAAATTTTGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCA 3649
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70670 TTGCCAAATTTTGAAACTACACACAAAAAGCATACTTGCATTATTTATAATAAAATTGCA 70611

Qy   3650 TTCAGTGGCTTTTTAAAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAA 3709
          |||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||
Db  70610 TTCAGTGGCTTTTT-AAAAAAATGTTTGATTCAAAACTTTAACATACTGATAAGTAAGAA 70552

Qy   3710 ACAATTATAATTTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACT 3769
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70551 ACAATTATAATTTCTTTACATACTCAAAACCAAGATAGAAAAAGGTGCTATCGTTCAACT 70492

Qy   3770 TCAAAACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAA 3829
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70491 TCAAAACATGTTTCCTAGTATTAAGGACTTTAATATAGCAACAGACAAAATTATTGTTAA 70432

Qy   3830 CATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTAT 3889
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70431 CATGGATGTTACAGCTCAAAAGATTTATAAAAGATTTTAACCTATTTTCTCCCTTATTAT 70372

Qy   3890 CCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCC 3949
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70371 CCACTGCTAATGTGGATGTATGTTCAAACACCTTTTAGTATTGATAGCTTACATATGGCC 70312

Qy   3950 AAAGGAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAA 4009
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70311 AAAGGAATACAGTTTATAGCAAAACATGGGTATGCTGTAGCTAACTTTATAAAAGTGTAA 70252

Qy   4010 TATAACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATA 4069
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70251 TATAACAATGTAAAAAATTATATATCTGGGAGGATTTTTTGGTTGCCTAAAGTGGCTATA 70192

Qy   4070 GTTACTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTA 4129
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70191 GTTACTGATTTTTTATTATGTAAGCAAAACCAATAAAAATTTAAGTTTTTTTAACAACTA 70132

Qy   4130 CCTTATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAA 4189
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70131 CCTTATTTTTCACTGTACAGACACTAATTCATTAAATACTAATTGATTGTTTAAAAGAAA 70072

Qy   4190 TATAAATGTGACAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAA 4249
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  70071 TATAAATGTGACAAGTGGACATTATTTATGTTAAATATACAATTATCAAGCAAGTATGAA 70012

Qy   4250 GTTATTCAATTAAAATGCCACATTTCTGGTCTCTGGG 4286
          |||||||||||||||||||||||||||||||||||||
Db  70011 GTTATTCAATTAAAATGCCACATTTCTGGTCTCTGGG 69975

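The summary figures printed for each result can be cross-checked against one another. The sketch below is not part of the search report; it assumes that "Query Match" is the hit score divided by the perfect score (4301, from the report header) and that "Best Local Similarity" is the number of matches divided by the number of aligned columns (matches + mismatches + indels). The score reconstruction further assumes a match value of 1 and a gap cost of Gapop + Gapext per inserted base (10.0 and 1.0 in the header); the mismatch penalty of 0.6 is inferred from the three results shown here and is not stated anywhere in the report. All names in the code are hypothetical.

```python
from dataclasses import dataclass

PERFECT_SCORE = 4301      # "Perfect score" in the report header
GAP_OPEN = 10.0           # "Gapop" in the report header
GAP_EXTEND = 1.0          # "Gapext" in the report header
MISMATCH_PENALTY = 0.6    # ASSUMPTION: inferred from Results 8-10, not stated in the report


@dataclass
class HitSummary:
    score: float
    matches: int
    mismatches: int
    indels: int   # total gap columns
    gaps: int     # number of separate gaps


def query_match_pct(hit: HitSummary, perfect: float = PERFECT_SCORE) -> float:
    """Score as a percentage of the perfect (self-match) score."""
    return round(100.0 * hit.score / perfect, 1)


def best_local_similarity_pct(hit: HitSummary) -> float:
    """Matches as a percentage of all aligned columns."""
    columns = hit.matches + hit.mismatches + hit.indels
    return round(100.0 * hit.matches / columns, 1)


def reconstructed_score(hit: HitSummary) -> float:
    """Rebuild the score from the counts under the assumed scoring scheme."""
    gap_cost = hit.gaps * GAP_OPEN + hit.indels * GAP_EXTEND
    return round(hit.matches - MISMATCH_PENALTY * hit.mismatches - gap_cost, 1)


# Figures copied from the summary lines of Results 8, 9 and 10.
hits = {
    "RESULT 8": HitSummary(2841.8, 2854, 2, 1, 1),
    "RESULT 9": HitSummary(1684.6, 1698, 4, 1, 1),
    "RESULT 10": HitSummary(1676.6, 1679, 4, 0, 0),
}

for name, hit in hits.items():
    print(name, query_match_pct(hit), best_local_similarity_pct(hit),
          reconstructed_score(hit))
```

Under these assumptions the computed values agree with the printed Query Match, Best Local Similarity and Score for all three results, which is a useful sanity check when individual digits in the listing are in doubt.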


RESULT 9 

US-10-116-802-117 

; Sequence 117, Application US/10116802
; Publication No. US20030065157A1
; GENERAL INFORMATION:
; APPLICANT: Amy Lasek
; TITLE OF INVENTION: GENES EXPRESSED IN LUNG CANCER
; FILE REFERENCE: PA-0045 US
; CURRENT APPLICATION NUMBER: US/10/116,802
; CURRENT FILING DATE: 2002-04-04
; PRIOR APPLICATION NUMBER: 60/281,593
; PRIOR FILING DATE: 2001-04-04
; NUMBER OF SEQ ID NOS: 519
; SOFTWARE: PERL Program
; SEQ ID NO 117
; LENGTH: 1892
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: misc_feature
; OTHER INFORMATION: Incyte ID No: 1094000.5
US-10-116-802-117 

Query Match 39.2%; Score 1684.6; DB 13; Length 1892; 

Best Local Similarity 99.7%; Pred. No. 0; 

Matches 1698; Conservative 0; Mismatches 4; Indels 1; Gaps 1; 

Qy    178 TGAAACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGC 237
          ||   || ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    190 TGTCTCTAGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGC 249

Qy    238 ATGCAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 297
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    250 ATGCAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 309

Qy    298 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 357
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    310 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 369

Qy    358 CAAACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCC 417
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    370 CAAACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCC 429

Qy    418 AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 477
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    430 AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 489

Qy    478 CCGCCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTC 537
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    490 CCGCCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTC 549

Qy    538 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 597
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    550 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 609

Qy    598 ACACTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATC 657
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    610 ACACTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATC 669

Qy    658 GCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTAC 717
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    670 GCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTAC 729

Qy    718 AAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATA 777
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    730 AAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATA 789

Qy    778 CAGAAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATAT 837
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    790 CAGAAAGCCTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATAT 849

Qy    838 CGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTA 897
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    850 CGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTA 909

Qy    898 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCT-GGCTGTCCCTGAAGCCATAGGTTT 956
          ||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
Db    910 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGGCTGTCCCTGAAGCCATAGGTTT 969

Qy    957 TGATATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGT 1016
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    970 TGATATAATTACGATGGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGT 1029

Qy   1017 TCAGAAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTT 1076
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1030 TCAGAAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTT 1089

Qy   1077 CTATTTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAAT 1136
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1090 CTATTTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAAT 1149

Qy   1137 GTTGAGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGA 1196
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1150 GTTGAGAAAGAAAAGTGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGA 1209

Qy   1197 AGTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCA 1256
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1210 AGTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCA 1269

Qy   1257 CCTCAGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACT 1316
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1270 CCTCAGCAGGATTCTGAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACT 1329

Qy   1317 TTTGAGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTG 1376
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1330 TTTGAGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTG 1389

Qy   1377 CATTAACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATG 1436
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1390 CATTAACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATG 1449

Qy   1437 CTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTG 1496
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1450 CTTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTG 1509

Qy   1497 CTTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAG 1556
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1510 CTTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAG 1569



Qy   1557 CTCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCA 1616
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1570 CTCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCA 1629

Qy   1617 TTAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACA 1676
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1630 TTAAAACAAAATGAAACATTTGCCAAAACAAAACAAAAAACTATGTATTTGCACAGCACA 1689

Qy   1677 CTATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAG 1736
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1690 CTATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAG 1749

Qy   1737 CTGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACT 1796
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1750 CTGTTTACGGCATGGAAAGAAAATCAGTGGGAATTAAGAAAGCCTCGTCGTGAAAGCACT 1809

Qy   1797 TAATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAA 1856
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1810 TAATTTTTTACAGTTAGCACTTCAACATAGCTCTTAACAACTTCCAGGATATTCACACAA 1869

Qy   1857 CACTTAGGCTTAAAAATGAGCTC 1879
          |||||||||||||||||||||||
Db   1870 CACTTAGGCTTAAAAATGAGCTC 1892



RESULT 10 
US-10-116-802-118 

; Sequence 118, Application US/10116802
; Publication No. US20030065157A1
; GENERAL INFORMATION:
; APPLICANT: Amy Lasek
; TITLE OF INVENTION: GENES EXPRESSED IN LUNG CANCER
; FILE REFERENCE: PA-0045 US
; CURRENT APPLICATION NUMBER: US/10/116,802
; CURRENT FILING DATE: 2002-04-04
; PRIOR APPLICATION NUMBER: 60/281,593
; PRIOR FILING DATE: 2001-04-04
; NUMBER OF SEQ ID NOS: 519
; SOFTWARE: PERL Program
; SEQ ID NO 118
; LENGTH: 1877
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: misc_feature
; OTHER INFORMATION: Incyte ID No: 1222734CB1
US-10-116-802-118 

Query Match 39.0%; Score 1676.6; DB 13; Length 1877; 

Best Local Similarity 99.8%; Pred. No. 0; 

Matches 1679; Conservative 0; Mismatches 4; Indels 0; Gaps 0;

Qy    178 TGAAACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGC 237
          ||   || ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    190 TGTCTCTAGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGC 249

Qy    238 ATGCAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 297
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    250 ATGCAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 309

Qy    298 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 357
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    310 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 369

Qy    358 CAAACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCC 417
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    370 CAAACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCC 429

Qy    418 AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 477
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    430 AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 489

Qy    478 CCGCCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTC 537
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    490 CCGCCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTC 549

Qy    538 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 597
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    550 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 609

Qy    598 ACACTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATC 657
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    610 ACACTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATC 669

Qy    658 GCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTAC 717
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    670 GCCAGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTAC 729

Qy    718 AAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATA 777

| | M I I I I I I I II I I I I I I I I M I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I M I 
730 AAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATA 789 

778 C AGAAAGC C T C CGT GGGAAT CACT GT GC T GAGT CTAT GT GCT CT GAGT AT T GAC AGAT AT 837 

| | | | | | | | | | | I M I I I I I I I I I I I I I I I I M II I I I I I I I I I I I M I I I I I M I I M I I 
790 C AGAAAG C CT C C GT GGGAAT CACT GT GCT GAGT CTAT GT GCT CT GAGT AT T GAC AGAT AT 849 

838 CGAGC T GTTGCTTCTT GGAGT AGAATT AAAG GAAT T GG GGT T C CAAAAT GGACAG CAGT A 8 97 

I | | | | I || I I I I I II I I I I I I I I I I I I I I I I I I I I M I I I II I I I I I I I M I I I I I I M I 
8 50 CGAGCTGTTGCTTCTTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTA 90 9 

8 98 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTT 957 

| | M I I I I I I I I I I M I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I M I I I I I I 
910 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTT 969 

958 GAT AT AAT T AC GAT GGACTACAAAG GAAGT TAT CT GC GAAT CT GCTT GCT T CAT C CC GTT 1017 

| | | | || | I I | I I I || I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I M I I I I I I 
97 0 GAT AT AAT T AC GAT GG ACT ACAAAGGAAGTT AT CT GC GAAT CTGCTTGCTT CAT C C C GT T 102 9 

1018 CAGAAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTC 1077 

| | | | I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I M 
1030 CAGAAGACAGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTC 1089 

107 8 TATTTCTGCTTGCCATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATG 1137 



1090 TAT TTCTGCTTGC CAT T GGC C AT C ACT G CAT T T T T T TAT AC AC T AAT GAC CT GT GAAAT G 1149 

1138 T T GAGAAAGAAAAGT GGCAT GCAGAT T GCT T T AAAT GAT CAC CT AAAGCAGAGAC G GGAA 1197 

I I I M I I I I II I I I I I I i I I I I I I I I I I I 1 I I I I I i I I I I I I I I 

1150 T T GAGAAAGAAAAGT GGCAT GCAGAT T G CT T T AAAT GAT CAC CT AAAGCAGAGAC GGGAA 1209 

1198 GTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCAC 1257 

I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I 11 I I I I I I 

1210 GTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCAC 1269 

1258 CT C AGC AGGAT T CT GAAGCT C ACT C T T TAT AAT C AGAAT GAT C C CAAT AGAT GT GAACTT 1317 

I I I I I I I I I I I I 1 I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I 
127 0 CT C AG C AG GAT T CT GAAGCT C ACT CT T TAT AAT CAGAAT GAT C C CAAT AGAT GT GAACT T 132 9 

1318 T T GAGCT T T CT GT T G GT ATT G GACT AT AT T GGT AT CAAC AT GG CT T C ACT GAATT C CT GC 1377 

I I M I I I I I I I I 1 I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I M 
1330 TTGAGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACT GAATT CCTGC 138 9 

137 8 AT T AAC C CAAT T GCT CT GT AT T T GGT GAGCAAAAGAT T CAAAAACT G CT T T AAGT CAT GC 1437 

I I I M I I I I I M I I I I I II I I I I I I M I I M I I I M I I I I I I I I I I I I I I I I I I I I I I M 
1390 AT TAAC C CAAT T GCT CT GT AT T T GGT GAGCAAAAGAT T CAAAAACT GCT T TAAGT CAT GC 14 4 9 

1438 TTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGC 1497 

I | | I I I I I I II I I I I I I M I I I I I I II I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I 
1450 TTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGC 1509 

1498 T T AAAGT T CAAAGCT AAT GAT C AC GGAT AT GACAACT T C C GT T C CAGT AAT AAAT AC AGC 1557 

I I I I I I I I I I I I I I I M II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1510 T T AAAGT T CAAAGCT AAT GAT CAC GGAT AT GACAACT T C C GT T C C AGTAAT AAATAC AGC 1569 

1558 TCATCTTGAAAGAAGAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCAT 1617 

I | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I M I I I I I I I I I I I 
1570 T CAT CT T GAAAGAAGAACT AT T CACT GT ATT T CAT T T T CT T TAT AT T GGAC C GAAGT CAT 1629 

1618 TAAAACAAAAT GAAACATTT GC CAAAACAAAACAAAAAACT AT GT ATTT GCACAGCACAC 1677 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I II I I II I I I I I I I I II I 
1630 TAAAACAAAAT GAAACATTT GC CAAAACAAAACAAAAAACTAT GT AT TT GCACAGCACAC 168 9 

1678 TATTAAAATATTAAGTGTAATTATTTTAACACTCACAGCTACATATGACATTTTATGAGC 1737 

I | | | | | I I I I I I I I I I I II I I I I I I I II I I II I I I I I M I I I M M I M I I I I I I 

1690 TAT T AAAAT AT TAAGT GT AAT TAT T T TAAC ACT CAC AG CT AC AT AT GAC AT T T TAT GAGC 1749 

1738 T GT TT ACGGC AT G GAAAGAAAAT CAGT GGGAAT T AAGAAAGCCT CGT C GT GAAAGCACT T 1797 

M I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I M I I I II II I I I I I I I I I I I I I I 
1750 T GT TT AC GG CAT GGAAAGAAAAT CAGT G GGAAT TAAGAAAGC CT C GT C GT GAAAG C ACT T 1809 

17 98 AAT TT T TT ACAGT T AG CACT T CAAC AT AG CT CT T AACAACT T C C AGGAT AT T CACAC AAC 1857 

I M I II I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I M I I I I I I I I I M I I I I M I I I I 
1810 AAT TT T T T AC AGT T AGC AC T T CAACAT AGCT CT T AACAACT T C CAG GAT AT T C ACACAAC 1869 

1858 ACT 1860 
I I I 

1870 ACT 1872 



RESULT 11 
US-10-305-720-1203 

Sequence 1203, Application US/10305720 
Publication No. US20040010136A1 
GENERAL INFORMATION: 
APPLICANT: Au-Young, Janice K.; Seilhamer, Jeffrey J. 

TITLE OF INVENTION: Composition for the Detection of Signaling Pathway Gene 
Expression 

FILE REFERENCE: PA-0002-1 CON 
CURRENT APPLICATION NUMBER: US/10/305,720 
CURRENT FILING DATE: 2002-11-26 
PRIOR APPLICATION NUMBER: 09/016,434 
PRIOR FILING DATE: 1998-01-30 
NUMBER OF SEQ ID NOS : 1490 
SOFTWARE: PERL Program 
SEQ ID NO 1203 
LENGTH: 1470 
TYPE: DNA 

ORGANISM: Homo sapiens 
FEATURE: 

NAME/KEY: misc_feature 

OTHER INFORMATION: GenBank ID No: g182275 
US-10-305-720-1203 

Query Match 34.1%; Score 1466.8; DB 16; Length 1470; 

Best Local Similarity 99.9%; Pred. No. 8.3e-291; 

Matches 1468; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy  192 GAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATGCAGCCGCCTCC 251
        |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||||
Db    1 GAAACTGCGGACGGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATGCAGCCGCCTCC 60

Qy  252 AAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTGTCGCGGATCTG 311
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 AAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTGTCGCGGATCTG 120

Qy  312 GGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAAACCGCAGAGAT 371
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 GGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAAACCGCAGAGAT 180

Qy  372 AATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGTCTGGCGCGGTC 431
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 AATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGTCTGGCGCGGTC 240

Qy  432 GTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCCACGCACCAT 491
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 GTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCCACGCACCAT 300

Qy  492 CTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAATACATCAACAC 551
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 CTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAATACATCAACAC 360

Qy  552 GGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCTGAGAAT 611
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 GGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCTGAGAAT 420

Qy  612 TATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCCAGCTTGGCTCT 671
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 TATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCCAGCTTGGCTCT 480

Qy  672 GGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCTGGCAGA 731
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 GGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCTGGCAGA 540

Qy  732 GGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAGAAAGCCTCCGT 791
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 GGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAGAAAGCCTCCGT 600

Qy  792 GGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGAGCTGTTGCTTC 851
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 GGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGAGCTGTTGCTTC 660

Qy  852 TTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTTGAT 911
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 TTGGAGTAGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTTGAT 720

Qy  912 TTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAATTACGAT 971
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 TTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAATTACGAT 780

Qy  972 GGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAGAAGACAGCTTT 1031
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 GGACTACAAAGGAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAGAAGACAGCTTT 840

Qy 1032 CATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTTGCC 1091
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 CATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTTGCC 900

Qy 1092 ATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTGAGAAAGAAAAG 1151
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 ATTGGCCATCACTGCATTTTTTTATACACTAATGACCTGTGAAATGTTGAGAAAGAAAAG 960

Qy 1152 TGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTGGCCAAAACCGT 1211
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 TGGCATGCAGATTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTGGCCAAAACCGT 1020

Qy 1212 CTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGATTCT 1271
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 CTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGATTCT 1080

Qy 1272 GAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTGAGCTTTCTGTT 1331
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 GAAGCTCACTCTTTATAATCAGAATGATCCCAATAGATGTGAACTTTTGAGCTTTCTGTT 1140

Qy 1332 GGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATTAACCCAATTGC 1391
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 GGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGCATTAACCCAATTGC 1200

Qy 1392 TCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCTGCTGGTG 1451
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 TCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCTGCTGGTG 1260

Qy 1452 CCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTAAAGTTCAAAGC 1511
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 CCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTAAAGTTCAAAGC 1320

Qy 1512 TAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCATCTTGAAAGAA 1571
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 TAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGCTCATCTTGAAAGAA 1380

Qy 1572 GAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAAAACAAAATGAA 1631
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 GAACTATTCACTGTATTTCATTTTCTTTATATTGGACCGAAGTCATTAAAACAAAATGAA 1440

Qy 1632 ACATTTGCCAAAACAAAACAAAAAACTATG 1661
        ||||||||||||||||||||||||||||||
Db 1441 ACATTTGCCAAAACAAAACAAAAAACTATG 1470



RESULT 12 
US-10-311-671-28 

Sequence 28, Application US/10311671 
Publication No. US20040072996A1 
GENERAL INFORMATION: 
APPLICANT: INCYTE GENOMICS, INC. 
APPLICANT: LAL, Preeti G. 
APPLICANT: BAUGHN, Marian R. 
APPLICANT: HAFALIA, April J. A. 
APPLICANT: NGUYEN, Danniel B. 
APPLICANT: GANDHI, Ameena R. 
APPLICANT: KALLICK, Deborah A. 
APPLICANT: GRIFFIN, Jennifer A. 
APPLICANT: YUE, Henry 
APPLICANT: KHAN, Farrah A. 
APPLICANT: ARVIZU, Chandra S. 
APPLICANT: LU, Dyung Aina M. 
APPLICANT: TRIBOULEY, Catherine M. 
APPLICANT: LU, Yan 
APPLICANT: CHAWLA, Narinder K. 
APPLICANT: GRAUL, Richard 
APPLICANT: YAO, Monique G. 
APPLICANT: YANG, Junming 
APPLICANT: RAMKUMAR, Jayalaxmi 
APPLICANT: AU-YOUNG, Janice K. 
APPLICANT: ELLIOTT, Vicki S. 
APPLICANT: HERNANDEZ, Roberto 
APPLICANT: WALSH, Roderick T. 
APPLICANT: BOROWSKY, Mark L. 
APPLICANT: THORNTON, Michael B. 
APPLICANT: HE, Ann 

TITLE OF INVENTION: G-PROTEIN COUPLED RECEPTORS 
FILE REFERENCE: PI-0131 USN 

CURRENT APPLICATION NUMBER: US/10/311,671 
CURRENT FILING DATE: 2002-12-16 
PRIOR APPLICATION NUMBER: PCT/US01/19275 
PRIOR FILING DATE: 2001-06-15 
PRIOR APPLICATION NUMBER: 60/212,483 
PRIOR FILING DATE: 2000-06-16 
PRIOR APPLICATION NUMBER: 60/213,954 
PRIOR FILING DATE: 2000-06-22 
PRIOR APPLICATION NUMBER: 60/215,209 
PRIOR FILING DATE: 2000-06-29 



PRIOR APPLICATION NUMBER: 60/216,595 
PRIOR FILING DATE: 2000-07-07 
PRIOR APPLICATION NUMBER: 60/218,936 
PRIOR FILING DATE: 2000-07-14 
PRIOR APPLICATION NUMBER: 60/219,154 
PRIOR FILING DATE: 2000-07-19 
PRIOR APPLICATION NUMBER: 60/220,141 
PRIOR FILING DATE: 2000-07-21 
NUMBER OF SEQ ID NOS : 35 
SOFTWARE: PERL Program 
SEQ ID NO 28 
LENGTH: 1632 
TYPE: DNA 

ORGANISM: Homo sapiens 
FEATURE: 

NAME/KEY: misc_feature 

OTHER INFORMATION: Incyte ID No: 6792419CB1 
US-10-311-671-28 

Query Match 32.3%; Score 1389; DB 12; Length 1632; 

Best Local Similarity 100.0%; Pred. No. 8.6e-275; 

Matches 1389; Conservative 0; Mismatches 0; Indels 0; Gaps 0 

Qy 186 GGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATGCAGCC 245 

I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I 
Db 236 GGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATGCAGCC 2 95 

Qy 246 GCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTGTCGCG 305 

I I I I I I I 1 I I I I I I I I I I II I M I I II I I I II I I I I I I I I ! II I I I I I I I I I I I I I I I I I 
Db 296 GCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTGTCGCG 355 

Qy 306 GATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAAACCGC 365 

I II I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 356 GATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAAACCGC 415 

Qy 366 AGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGTCTGGC 425 

I I M II I I I II II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I II I I I I 
Db 416 AGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGTCTGGC 475 

Qy 426 GCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCCACG 485 

I I I I I I I I I I I I I 1 I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I 
Db 47 6 GCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCCACG 535 

Qy 48 6 CACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAATACAT 545 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 536 C AC CAT CTCCCCTCCCCCGTGC C AAG G AC C CAT C GAG AT C AAG GAG AC T T T C AAAT AC AT 595 

Qy 546 CAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCT 605 

I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I II I I I I I I I I I II II I I I I I I M I I I I I 
Db 596 CAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCT 655 

Qy 606 GAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCCAGCTT 665 

I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I 
Db 656 GAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCCAGCTT 715 

Qy 666 GGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCT 725 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I II I I 1 



Db 716 GGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCT 775 



Qy 726 GGCAGAGGACT GGC CAT TT GGAGCT GAGAT GT GTAAGCT GGT GCCTTTCAT ACAGAAAGC 7 85 

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I II I I I I I I I I I I I I I I I I I M I 
Db 776 G GC AGAGGACT GGC C AT T T G GAG CT GAGAT GT GTAAGCT GGTGCCTTT C AT AC AGAAAG C 835 

Qy 786 C T C C GT GG GAAT C ACT GT GCT GAGT CT AT GT G CT CT GAGT AT T GAC AGAT AT C GAG C T GT 845 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 836 CTCCGTGGGAATCACTGTGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGAGCTGT 8 95 

Qy 84 6 TGCTTCTT GGAGT AGAAT T AAAGGAAT T G GG GT T C CAAAAT GGACAG CAGT AGAAAT T GT 905 

I I I I I I I I I I I I I M I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I M 
Db 896 TGCTTCTT GGAGT AGAATTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAAATTGT 955 

Qy 90 6 TTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAAT 965 

I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 956 TTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAAT 1015 

Qy 966 T AC GAT GGACT ACAAAG GAAGT TAT CT GC GAAT CTGCTTGCTT CAT C C C GTT C AGAAGAC 1025 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 
Db 1016 T AC GAT GGACT ACAAAG GAAGT TAT CT GCGAAT CT GCTT GCTT CAT CCCGTT CAGAAGAC 1075 

Qy 102 6 AGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTG 1085 

I I I I I I I I M I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I II I I I I I I I I I I I 
Db 1076 AGCTTTCATGCAGTTTTACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTG 1135 

Qy 108 6 C T T G C CAT T GGC CAT C ACT GC AT T T T T TT AT AC AC TAAT GAC CT GT GAAAT GT T GAGAAA 1145 

I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
Db 1136 C T T G C CAT T GGC CAT C ACT G CAT T T T T TT ATAC ACTAAT GAC CT GT GAAAT GT T GAGAAA 1195 

Qy 1146 GAAAAGT GG C AT GCAGAT T GCT T T AAAT GAT CAC C T AAAG CAGAGAC GGGAAGT GGC CAA 12 05 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 1196 GAAAAGT GG CAT GCAGAT T GCT T T AAAT GAT CAC C T AAAG CAGAGAC GG GAAGT GGC CAA 1255 

Qy 1206 AACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAG 1265 

I I II I I II I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1256 AACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAG 1315 

Qy 1266 GATT CT GAAGCT CACT CTTT ATAAT CAGAATGAT C CCAAT AGAT GT GAACTTT T GAGCTT 1325 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I II I I I I I I 
Db 1316 GATT CT GAAGCT CACT CTTT ATAAT CAGAAT GAT C CCAAT AGAT GT GAACTTTT GAGCTT 1375 

Qy 132 6 TCT GTT GGTATTGGACTATATTGGTATCAACATGGCTT CACT GAATTCCTGCATTAACCC 1385 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1376 T CT GT T GGT AT T G GACT AT AT T G GT AT CAACAT GGCT T CAC T GAAT T CCT G CAT T AAC C C 1435 

Qy 1386 AATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCTG 1445 

II I I I I I I I I I I I II I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I M I I 

Db 1436 AATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGCTTATGCTG 1495 

Qy 1446 CTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGCTTAAAGTT 1505 

I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 14 96 CT GGT GCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGT GCTT AAAGTT 1555 

Qy 1506 C AAAG CT AAT GAT CAC GGAT AT GACAACT T C C GT T C C AGT AAT AAAT AC AGC T CAT CT T G 1565 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I M I I I I 
Db 1556 CAAAGCT AAT GAT CAC G GAT AT GACAACTT C C GT T C CAGT AAT AAAT AC AGC T CAT CT T G 1615 



Qy 1566 AAAGAAGAA 1574
        |||||||||
Db 1616 AAAGAAGAA 1624



RESULT 13 



US-09-826-509-496 

; Sequence 496, Application US/09826509 

; Publication No. US20030204073A1 

; GENERAL INFORMATION: 

; APPLICANT: Lehmann-Bruinsma, Karin 
; APPLICANT: Liaw, Chen W. 
; APPLICANT: Lin, I-Lin 

; TITLE OF INVENTION: Non-Endogenous, Constitutively Activated Known G 
; TITLE OF INVENTION: Protein-Coupled Receptors 
; FILE REFERENCE: AREN-207 

; CURRENT APPLICATION NUMBER: US/09/826,509 

; CURRENT FILING DATE: 2001-04-05 

; PRIOR APPLICATION NUMBER: 60/195,747 

; PRIOR FILING DATE: 2000-04-07 

; PRIOR APPLICATION NUMBER: 09/170,496 

; PRIOR FILING DATE: 1998-10-13 

; NUMBER OF SEQ ID NOS : 589 

; SOFTWARE: PatentIn Version 2.1 

; SEQ ID NO 496 

; LENGTH: 1329 

; TYPE: DNA 
; ORGANISM: Homo sapiens 
US-09-826-509-496 

Query Match 30.8%; Score 1322.6; DB 11; Length 1329; 

Best Local Similarity 99.7%; Pred. No. 3.3e-261; 

Matches 1325; Conservative 0; Mismatches 4; Indels 0; Gaps 0; 

Qy 238 ATGCAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 297 

I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 ATGCAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGC 60 

Qy 298 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTG 357 

I I I I I I I I I I 1 I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I 
Db 61 CTGTCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCCGACAGGGCCACTCCGCTTTTG 120 

Qy  358 CAAACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCC 417
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 CAAACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCC 180

Qy  418 AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 477
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 AGTCTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCT 240

Qy  478 CCGCCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTC 537
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 CCGCCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTC 300

Qy  538 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 597
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 AAATACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCC 360

Qy 598 AC ACT T C T GAGAAT TAT CT ACAAGAAC AAGT GCAT GC GAAAC GGT C C CAAT AT CT T GAT C 657 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 
Db 361 AC ACT T CT GAGAAT TAT CT ACAAGAAC AAGT GCAT GC GAAAC GGT C C CAAT AT C T T GAT C 420 

Qy 658 G C C AGCT T G GC T CT GGGAGAC CT GCT GC ACAT C GT CAT T GACAT C C CT AT CAAT GT CT AC 717 

I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 GC CAGCT T G GC T CT GGGAGAC CT G CT GCACATCGT CAT T GACAT C C CT AT CAAT GTCT AC 480 

Qy 718 AAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATA 777 

I I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I 
Db 4 81 AAGCTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATA 540 

Qy 778 C AGAAAG C CT C C GT G GGAAT C ACT GT GCT GAGT CT AT GT G CT CT GAGT AT T GAC AGAT AT 837 

I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 541 CAGAAAGC CTCC GT GGGAAT CACT GT GCT GAGT C TAT GT GCT CT GAGTATTGACAGATAT 600 

Qy 838 C GAG C T GT T GCT T C T T GGAGT AGAAT T AAAG GAAT T GGGGT T C CAAAAT GGAC AG C AGT A 897 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 601 C GAG CT GT T GC T T CT T GGAGTAGAATTAAAGGAATT GGGGT TC CAAAAT GGAC AG C AGT A 660 

Qy 8 98 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTT 957 

I I I I I I I I II I I II I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I 
Db 661 GAAATTGTTTTGATTTGGGTGGTCTCTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTT 72 0 

Qy 958 GAT AT AAT T AC GAT G GACT ACAAAG GAAGTT AT CT GC GAAT CT G CT T GC T T CAT C C C GT T 1017 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I M I I I I I I I I I I M II I I I I I I 
Db 721 GAT AT AAT T AC GAT GGACT ACAAAG GAAGTT AT CT G C GAAT CTGCTTGCTT CAT C C C GT T 780 

Qy 1018 C AGAAGACAGCT TT CAT GC AGT T T T ACAAGAC AG CAAAAGAT T GGT GGCT GT T C AGT T T C 1077 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 781 C AGAAGACAGCT TT CAT GC AGT T T T ACAAGAC AGCAAAAGAT T GGT GGCT GT T C AGT T T C 840 

Qy 107 8 TAT TTCTGCTTGC C ATT G GC C AT CACT GCAT T T T T T T AT AC ACTAAT GAC CT GT GAAAT G 1137 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I II I I I I 
Db 841 TAT T T C T GCTT GC CAT T GGC CAT C AC T GCAT T T T T T TAT AC ACTAAT GAC CT GT GAAAT G 900 

Qy 1138 TTGAGAAAGAAAAGT GGCAT GCAGATT GCTTTAAAT GAT CACCTAAAGCAGAGAC GGGAA 1197 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 901 T T GAGAAAGAAAAGT GGC AT GC AGAT T GC T T T AAAT GAT CAC C T AAAGCAGAG AC GG GAA 960 

Qy 1198 GTGGCCAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCAC 1257 

IN I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 961 GTGAAGAAAACCGTCTTTTGCCTGGTCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCAC 1020 

Qy 1258 CT C AGCAG GAT T CT GAAGC T CACT C T T TAT AAT CAGAAT GAT C C CAAT AGAT GT GAACT T 1317 

I I I M I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I M I 
Db 1021 CT C AGC AGGAT T CT GAAG CT CACT CT T TAT AAT CAGAAT GAT C C CAAT AGAT GT GAACT T 1080 

Qy 1318 TTGAGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGC 1377 

I I I I I I I I I I I I I I I I I II I I II I I I I M I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 
Db 1081 TTGAGCTTTCTGTTGGTATTGGACTATATTGGTATCAACATGGCTTCACTGAATTCCTGC 1140 



Qy 1378 ATTAACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGC 1437
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 ATTAACCCAATTGCTCTGTATTTGGTGAGCAAAAGATTCAAAAACTGCTTTAAGTCATGC 1200



Qy 1438 T TAT GCT G CT G GT G CCAGT CAT T T GAAGAAAAAC AGT C C T T G GAGGAAAAGC AGT CGT G C 1497 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 
Db 1201 TTATGCTGCTGGTGCCAGTCATTTGAAGAAAAACAGTCCTTGGAGGAAAAGCAGTCGTGC 1260 

Qy       1498 TTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGC 1557
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1261 TTAAAGTTCAAAGCTAATGATCACGGATATGACAACTTCCGTTCCAGTAATAAATACAGC 1320

Qy       1558 TCATCTTGA 1566
              |||||||||
Db       1321 TCATCTTGA 1329



RESULT 14 
US-10-235-192A-32 

Sequence 32, Application US/10235192A 
Publication No. US20040043389A1 
GENERAL INFORMATION: 
APPLICANT: McCarthy, Jeanette 

TITLE OF INVENTION: Methods and Compositions for Identifying 

TITLE OF INVENTION: Risk Factors for Abnormal Lipid Levels and the Diseases 
TITLE OF INVENTION: and Disorders Associated Therewith 
FILE REFERENCE: MMI-011 

CURRENT APPLICATION NUMBER: US/10/235,192A
CURRENT FILING DATE: 2002-09-04 
NUMBER OF SEQ ID NOS: 49

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 32 
LENGTH: 1578 
TYPE: DNA 

ORGANISM: Homo sapiens 
US-10-235-192A-32 

Query Match 28.4%; Score 1220.4; DB 13; Length 1578; 

Best Local Similarity 99.8%; Pred. No. 3.6e-240; 

Matches 1228; Conservative 0; Mismatches 2; Indels 1; Gaps 1; 

Qy        203 GCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATGCAGCCGCCTCCAAGTCTGTGCG 262
              |||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||
Db        200 GCGGCCACCGGACG-CTTCTGGAGCAGGTAGCAGCATGCAGCCGCCTCCAAGTCTGTGCG 258

Qy        263 GACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTGTCGCGGATCTGGGGAGAGGAGA 322
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        259 GACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTGTCGCGGATCTGGGGAGAGGAGA 318

Qy        323 GAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAAACCGCAGAGATAATGACGCCAC 382
              ||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||
Db        319 GAGGCTTCCCGCCCGACAGGGCCACTCCGCTTTTGCAAACCGCAGAGATAATGACGCCAC 378

Qy        383 CCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGTCTGGCGCGGTCGTTGGCACCTG 442
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        379 CCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGTCTGGCGCGGTCGTTGGCACCTG 438

Qy        443 CGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCCACGCACCATCTCCCCTCCCC 502
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        439 CGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCGCCACGCACCATCTCCCCTCCCC 498

Qy        503 CGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAATACATCAACACGGTTGTGTCCT 562
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        499 CGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAATACATCAACACGGTTGTGTCCT 558

Qy        563 GCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCTGAGAATTATCTACAAGA 622
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        559 GCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACACTTCTGAGAATTATCTACAAGA 618

Qy        623 ACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCCAGCTTGGCTCTGGGAGACCTGC 682
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        619 ACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCCAGCTTGGCTCTGGGAGACCTGC 678

Qy        683 TGCACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCTGGCAGAGGACTGGCCAT 742
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        679 TGCACATCGTCATTGACATCCCTATCAATGTCTACAAGCTGCTGGCAGAGGACTGGCCAT 738

Qy        743 TTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAGAAAGCCTCCGTGGGAATCACTG 802
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        739 TTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAGAAAGCCTCCGTGGGAATCACTG 798

Qy        803 TGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGAGCTGTTGCTTCTTGGAGTAGAA 862
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        799 TGCTGAGTCTATGTGCTCTGAGTATTGACAGATATCGAGCTGTTGCTTCTTGGAGTAGAA 858

Qy        863 TTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTTGATTTGGGTGGTCT 922
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        859 TTAAAGGAATTGGGGTTCCAAAATGGACAGCAGTAGAAATTGTTTTGATTTGGGTGGTCT 918

Qy        923 CTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAATTACGATGGACTACAAAG 982
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        919 CTGTGGTTCTGGCTGTCCCTGAAGCCATAGGTTTTGATATAATTACGATGGACTACAAAG 978

Qy        983 GAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAGAAGACAGCTTTCATGCAGTTTT 1042
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        979 GAAGTTATCTGCGAATCTGCTTGCTTCATCCCGTTCAGAAGACAGCTTTCATGCAGTTTT 1038

Qy       1043 ACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTTGCCATTGGCCATCA 1102
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1039 ACAAGACAGCAAAAGATTGGTGGCTGTTCAGTTTCTATTTCTGCTTGCCATTGGCCATCA 1098

Qy       1103 CTGCATTTTTTTATACACTAATGACCTGTGAAATGTTGAGAAAGAAAAGTGGCATGCAGA 1162
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1099 CTGCATTTTTTTATACACTAATGACCTGTGAAATGTTGAGAAAGAAAAGTGGCATGCAGA 1158

Qy       1163 TTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTGGCCAAAACCGTCTTTTGCCTGG 1222
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1159 TTGCTTTAAATGATCACCTAAAGCAGAGACGGGAAGTGGCCAAAACCGTCTTTTGCCTGG 1218

Qy       1223 TCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGATTCTGAAGCTCACTC 1282
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1219 TCCTTGTCTTTGCCCTCTGCTGGCTTCCCCTTCACCTCAGCAGGATTCTGAAGCTCACTC 1278

Qy       1283 TTTATAATCAGAATGATCCCAATAGATGTGAACTTTTGAGCTTTCTGTTGGTATTGGACT 1342
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1279 TTTATAATCAGAATGATCCCAATAGATGTGAACTTTTGAGCTTTCTGTTGGTATTGGACT 1338



Qy       1343 ATATTGGTATCAACATGGCTTCACTGAATTCCTGCATTAACCCAATTGCTCTGTATTTGG 1402
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1339 ATATTGGTATCAACATGGCTTCACTGAATTCCTGCATTAACCCAATTGCTCTGTATTTGG 1398

Qy       1403 TGAGCAAAAGATTCAAAAACTGCTTTAAGTC 1433
              ||||||||||||||||||||||||||||| |
Db       1399 TGAGCAAAAGATTCAAAAACTGCTTTAAGGC 1429



RESULT 15 
US-09-778-927A-27 

Sequence 27, Application US/09778927A 
Patent No. US20020068342A1 
GENERAL INFORMATION: 
APPLICANT: KHOSRAVI, Rami et al.

TITLE OF INVENTION: NOVEL NUCLEIC ACID AND AMINO ACID SEQUENCES AND NOVEL 
TITLE OF INVENTION: VARIANTS OF ALTERNATIVE SPLICING 
FILE REFERENCE: 2786-0160P 

CURRENT APPLICATION NUMBER: US/09/778,927A
CURRENT FILING DATE: 2001-02-08 
PRIOR APPLICATION NUMBER: IL134453
PRIOR FILING DATE: 2000-02-09 
PRIOR APPLICATION NUMBER: IL135341 
PRIOR FILING DATE: 2000-03-29 
NUMBER OF SEQ ID NOS: 81
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 27
LENGTH: 800
TYPE: DNA 

ORGANISM: Homo sapiens 
FEATURE: 

NAME/KEY: misc_feature 
LOCATION: (1)..(800)

OTHER INFORMATION: n = a,c,g,t any unknown or other 
US-09-778-927A-27 

Query Match 17.7%; Score 763.2; DB 9; Length 800; 

Best Local Similarity 98.3%; Pred. No. 2e-146; 

Matches 771; Conservative 0; Mismatches 13; Indels 0; Gaps 0; 

Qy          1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 GAGACATTCCGGTGGGGGACTCTGGCCAGCCCGAGCAACGTGGATCCTGAGAGCACTCCC 60

Qy         61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AGGTAGGCATTTGCCCCGGTGGGACGCCTTGCCAGAGCAGTGTGTGGCAGGCCCCCGTGG 120

Qy        121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 AGGATCAACACAGTGGCTGAACACTGGGAAGGAACTGGTACTTGGAGTCTGGACATCTGA 180

Qy        181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 AACTTGGCTCTGAAACTGCGGAGCGGCCACCGGACGCCTTCTGGAGCAGGTAGCAGCATG 240



Qy        241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 CAGCCGCCTCCAAGTCTGTGCGGACGCGCCCTGGTTGCGCTGGTTCTTGCCTGCGGCCTG 300

Qy        301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 TCGCGGATCTGGGGAGAGGAGAGAGGCTTCCCGCCTGACAGGGCCACTCCGCTTTTGCAA 360

Qy        361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 ACCGCAGAGATAATGACGCCACCCACTAAGACCTTATGGCCCAAGGGTTCCAACGCCAGT 420

Qy        421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 CTGGCGCGGTCGTTGGCACCTGCGGAGGTGCCTAAAGGAGACAGGACGGCAGGATCTCCG 480

Qy        481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 CCACGCACCATCTCCCCTCCCCCGTGCCAAGGACCCATCGAGATCAAGGAGACTTTCAAA 540

Qy        541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 TACATCAACACGGTTGTGTCCTGCCTTGTGTTCGTGCTGGGGATCATCGGGAACTCCACA 600

Qy        601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        601 CTTCTGAGAATTATCTACAAGAACAAGTGCATGCGAAACGGTCCCAATATCTTGATCGCC 660

Qy        661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        661 AGCTTGGCTCTGGGAGACCTGCTGCACATCGTCATTGACATCCCTATCAATGTCTACAAG 720

Qy        721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGTAAGCTGGTGCCTTTCATACAG 780
              ||||||||||||||||||||||||||||||||||||||  || | |   | ||||  ||
Db        721 CTGCTGGCAGAGGACTGGCCATTTGGAGCTGAGATGTGCCAGGTAGGAGCGTTCACCCAC 780

Qy        781 AAAG 784
                ||
Db        781 CCAG 784
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